40 Commits

Author SHA1 Message Date
Wan-Teh Chang
18f9110516 Avoid AVX instructions in ScaleRowUp2_Linear_SSSE3
The "vpackuswb   %%xmm2,%%xmm0,%%xmm0" and "vmovdqu     %%xmm0,(%1)"
instructions in ScaleRowUp2_Linear_SSSE3() are AVX instructions. They
cause an illegal instruction exception on CPUs that do not support AVX.

Bug: libyuv:927
Bug: chromium:1312551
Change-Id: I87b2aaf041e7d185e7e8fb07172d4f37482e9d08
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3585881
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2022-04-15 00:18:39 +00:00
Frank Barchard
49ebc996aa Make 2 step transitive tests measure 2 step time.
Add tests of all macros used by libyuv public headers

When a 1 step conversion is added, a 2 step test can compare
the old 2 step method to the 1 step.  A 1 step unittest is
also added which compares C to SIMD.  Making the 2 step
conversions measure performance of the 2 steps allows the
old 2 step performance to be compared to 1 step.

All macros used in public headers are added to an ifdef test.
Showing them in a unittest allows some diagnostics when
a test is failing.

Bug: libyuv:901
Change-Id: I7ffa6ed0cb3b506fa1b7fd4b7b1b729658c3c266
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2857916
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-04-30 18:14:57 +00:00
Frank Barchard
5e05f26a2b Switch win32 to row_gcc for clangcl.
Bug: libyuv:900, libyuv:848, b/178283356, b/185922513
Change-Id: I7697953753391c555a778198db36412c853fb29e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2844962
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2021-04-22 19:32:32 +00:00
Yuan Tong
c41eabe3d4 Add full 16 bit scaling up by 2x function
R=fbarchard@chromium.org

Change-Id: I4a869aefdc16e34357a615727711594c5d8e3a80
Bug: libyuv:882
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2719842
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-03-02 19:29:02 +00:00
Frank Barchard
d768774299 add yuvconvstants util
miscellaneous cleanup of other code/comments

Bug: libyuv:873, libyuv:877
Change-Id: I0d8caf9a65908ff8898b25494f7c724775f84fa3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2692930
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-02-12 19:45:16 +00:00
Yuan Tong
d4ecb70610 Add P010ToP410 and P210ToP410
These are 16 bit bi-planar convert functions to scale UV plane to
Y plane's size using (bi)linear filter.

libyuv_unittest --gtest_filter=*ToP41*

R=fbarchard@chromium.org

Bug: libyuv:872
Change-Id: I3cb4fafe2b2c9eedd0d91cf4c619abb9ee107bc1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2690102
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-02-12 14:55:24 +00:00
Frank Barchard
12a4a2372c Rounding added to scaling upsampler
Bug: libyuv:872, b/178521093
Change-Id: I86749f73f5e55d5fd8b87ea6938084cbacb1cda7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2686945
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-02-10 18:51:02 +00:00
Yuan Tong
f7fc83f46d Add NV12ToNV24 and NV16ToNV24
These are bi-planar convert functions to scale UV plane to Y plane's size using (bi)linear filter.

libyuv_unittest --gtest_filter=*ToNV24*

R=fbarchard@chromium.org

Change-Id: I3d98f833feeef00af3c903ac9ad0e41bdcbcb51f
Bug: libyuv:872
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2682152
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-02-09 07:38:40 +00:00
Frank Barchard
942c508448 BT.2020 Full Range yuvconstants
new color util to compute constants needed based on white point.

[ RUN      ] LibYUVColorTest.TestFullYUVV
hist	      -2	      -1	       0	       1	       2
red	       0	 1627136	13670144	 1479936	       0
green	  319285	 3456836	 9243059	 3440771	  317265
blue	       0	 1561088	14202112	 1014016	       0

Bug: libyuv:877, b/178283356
Change-Id: If432ebfab76b01302fdb416a153c4f26ca0832d6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2678859
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2021-02-06 00:26:55 +00:00
Yuan Tong
fc61dde1eb Add special optimization for I420ToI444 and I422ToI444
These functions use (bi)linear filter, to scale U and V planes to the size of Y plane.
This will help enhance the quality of YUV to RGB conversion.

Also added 10bit and 12bit version:
I010ToI410
I210ToI410
I012ToI412
I212ToI412

libyuv_unittest --gtest_filter=LibYUVConvertTest.I42*ToI444*:LibYUVConvertTest.I*1*ToI41*

R=fbarchard@chromium.org

Change-Id: Ie4a711a5ba28f2ff1f44c021f7a5c149022264c5
Bug: libyuv:872
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2658097
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-02-03 10:53:02 +00:00
Frank Barchard
b7a1c5ee5d Scale by even factor low level row function
Bug: b/171884264
Change-Id: I6a94bde0aa05e681bb4590ea8beec33a61ddbfc9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2518361
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-11-03 21:25:18 +00:00
Frank Barchard
cec28e7088 PlaneScale, UVScale and ARGBScale test 3x and 4x down sample.
Intel SkylakeX
UVTest3x (1925 ms)
UVTest4x (2915 ms)
PlaneTest3x (2040 ms)
PlaneTest4x (4292 ms)
ARGBTest3x (2079 ms)
ARGBTest4x (1854 ms)

Pixel 2
ARGBTest3x (3602 ms)
ARGBTest4x (4064 ms)
PlaneTest3x (3331 ms)
PlaneTest4x (8977 ms)
UVTest3x (3473 ms)
UVTest4x (6970 ms)

Bug: b/171798872, b/171884264
Change-Id: Iebc70fed907857b6cb71a9baf2aba9861ef1e3f7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2505601
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-10-28 20:41:59 +00:00
Frank Barchard
a4ec5cf9c2 UVScale down use AVX2 and Neon for aarch32
Intel SkylakeX
Was SSSE3 UVScaleDownBy4_Box (2496 ms)
Now AVX2  UVScaleDownBy4_Box (1983 ms)

Was SSSE3 UVScaleDownBy2_Box (380 ms)
Now AVX2  UVScaleDownBy2_Box (360 ms)

Pixel 4 aarch32
Was UVScaleDownBy4_Box (4295 ms)
Now UVScaleDownBy4_Box (3307 ms)

Was UVScaleDownBy2_Box (1022 ms)
Now UVScaleDownBy2_Box (778 ms)

Bug: libuyv:838
Change-Id: Ic823fa15e5761c1b9a897da27341adbf1ed39883
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2470196
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-10-14 06:23:26 +00:00
Frank Barchard
d730dc2f18 2x down sample for UV planes ported to SSSE3 / NEON
Bug: libuyv:838
Change-Id: Id9fb3282a3e86143d76b5e0cb557f0523a88b3c8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2465578
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-10-13 21:42:15 +00:00
Frank Barchard
413a8d8041 Add AYUVToNV12 and NV21ToNV12
BUG=libyuv:832
TESTED=out/Release/libyuv_unittest --gtest_filter=*ToNV12* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

R=rrwinterton@gmail.com

Change-Id: Id03b4613211fb6a6e163d10daa7c692fe31e36d8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1560080
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2019-04-12 17:48:45 +00:00
Frank Barchard
664c735677 I420ToYUY2_AVX2 port
I420 and I422 To YUY2 and UYVY ported from SSE2 to AVX2.

Was SSE2
I420ToYUY2_Opt (135 ms)
I420ToUYVY_Opt (148 ms)
I422ToYUY2_Opt (145 ms)
I422ToUYVY_Opt (142 ms)

Now AVX2
I420ToYUY2_Opt (133 ms)
I420ToUYVY_Opt (130 ms)
I422ToYUY2_Opt (127 ms)
I422ToUYVY_Opt (137 ms)

Bug: libyuv:556
Test: out/Release/libyuv_unittest --sandbox_unittests --gtest_filter=*I42?To*UY*Opt
Change-Id: Ic35f97cee02dc009fd98785589ba17c7cf50bb35
Reviewed-on: https://chromium-review.googlesource.com/892493
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-02-01 00:33:25 +00:00
Frank Barchard
92e22cf5b6 Lint cleanup after C99 change CL
TBR=braveyao@chromium.org
Bug: libyuv:774
Test: git cl lint
Change-Id: I51cf8107a8db17fbc9952d610f3e4d7aac5aa743
Reviewed-on: https://chromium-review.googlesource.com/882217
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-24 19:16:03 +00:00
Frank Barchard
7e389884a1 Switch to C99 types
Append _t to all sized types.
uint64 becomes uint64_t etc

Bug: libyuv:774
Test: try bots build on all platforms
Change-Id: Ide273d7f8012313d6610415d514a956d6f3a8cac
Reviewed-on: https://chromium-review.googlesource.com/879922
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-01-23 19:16:05 +00:00
Frank Barchard
ecab5430c2 Remove MEMOPREG x64 NaCL macros
MEMOPREG macros are deprecated in row.h

Regular expressions to remove MEMOPREG macros:

MEMOPREG(movd, 0x00, [u_buf], [v_buf], 1, xmm1)                            \
MEMOPREG\((.*), (.*), (.*), (.*), (.*), (.*)\)
"\1    \2(%\3,%\4,\5),%%\6            \\n"

MEMOPREG(movdqu,0x00,1,4,1,xmm2)
MEMOPREG\((.*),(.*),(.*),(.*),(.*),(.*)\)
"\1    \2(%\3,%\4,\5),%%\6            \\n"

MEMOPREG(movdqu,0x00,1,4,1,xmm2)
MEMOPREG\((.*),(.*),(.*),(.*),(.*),(.*)\)(.*)(//.*)
"\1    \2(%\3,%\4,\5),%%\6           \\n"

TBR=braveyao@chromium.org

Bug: libyuv:702
Test: try bots pass
Change-Id: If8743abd9af2e8c549d0c7d3d49733a9b0f0ca86
Reviewed-on: https://chromium-review.googlesource.com/865964
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-01-16 19:10:44 +00:00
Frank Barchard
5088f00165 Remove MEMACCESS x64 NaCL macros
MEMACCESS macros are deprecated in row.h

Usage examples
    "movdqu    " MEMACCESS(0) ",%%xmm0         \n"
    "movdqu    " MEMACCESS2(0x10,0) ",%%xmm1   \n"

Regular expressions to remove MEMACCESS macros:

" MEMACCESS2\((.*),(.*)\) "(.*)\\n"
\1(%\2)\3              \\n"

" MEMACCESS\((.*)\) "(.*)\\n"
(%\1)\2            \\n"

Bug: libyuv:702
Test: try bots pass
Change-Id: I42f62d5dede8ef2ea643e78c204371a7659d25e6
Reviewed-on: https://chromium-review.googlesource.com/862803
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-01-12 20:37:41 +00:00
Frank Barchard
e3797d1765 Remove MEMOPARG x64 NaCL macros
MEMOPARG macros are deprecated in row.h

  #opcode " " #offset "(%" #base ",%" #index "," #scale "),%" #arg "\n"

Usage examples
    MEMOPARG(movzwl,0x00,1,3,1,k2)             //  movzwl  (%1,%3,1),%k2

Regular expression to remove MEMACCESS macro:

MEMOPARG\((.*),(.*),(.*),(.*),(.*),(.*)\)(.*//.*)
"\1    \2(%\3,%\4,\5),%\6                \\n"

Bug: libyuv:702
Test: try bots pass
Change-Id: I4a5ad2abf5017e651576f4c8c784be1c8dbf5a83
Reviewed-on: https://chromium-review.googlesource.com/863108
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-12 18:26:06 +00:00
Frank Barchard
3694891922 Remove MEMLEA x64 NaCL macros
Bug: libyuv:702
Test: try bots pass
Change-Id: I0ee094551734368f2179c298e7bf423ec80a929c
Reviewed-on: https://chromium-review.googlesource.com/857845
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-10 19:16:16 +00:00
Frank Barchard
55310f92bc Remove NACL_R14 macro
Bug: libyuv:702
Test: try bots still build
Change-Id: I05317e45c885955fcda233bdddbd11ce1d246d90
Reviewed-on: https://chromium-review.googlesource.com/854770
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-01-08 22:41:15 +00:00
Lei Zhang
8445617191 Mark a bunch of kArray variables as const.
This allows the linker to move the variables from the .data section to
the .rodata section.

Bug: libyuv:254
Test: out/Release/libyuv_unittest --gtest_filter=* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1

Change-Id: I6998570f1af4337d7b80313d9e18e36aa20d6ec0
Reviewed-on: https://chromium-review.googlesource.com/777033
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2017-11-27 23:38:44 +00:00
Frank Barchard
bbe8c233f2 scale warning fixes for unused parameters
BUG=libyuv:680
TEST=builds and runs with no warnings

Change-Id: I7d60ef44292fa6ad4f7c4e2e2657359b864d2dab
Reviewed-on: https://chromium-review.googlesource.com/442670
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-02-15 21:38:59 +00:00
Frank Barchard
e62309f259 clang-format libyuv
BUG=libyuv:654
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
137aa63afe Fix some comment typos
BUG=None
TEST=try bots

Review URL: https://codereview.chromium.org/2346633002 .
2016-09-15 15:38:19 -07:00
Frank Barchard
1b3e4aee47 make count a memory variable for 32 bit
32 bit clang runs out of registers and compiler does core dump.
force 32 bit build to use memory variable for counter.

BUG=libyuv:612
TBR=harryjin@google.com

Review URL: https://codereview.chromium.org/2091913003 .
2016-06-23 20:42:10 -07:00
Frank Barchard
cc88adc620 YUV scale filter columns improved filtering accuracy
upscale a YUV image.  observe change in hue.. green especially.
disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C
observe hue.. green especially, is better.

was ScaleFrom1280x720_Bilinear (1620 ms)
now ScaleFrom1280x720_Bilinear (1907 ms)

BUG=libyuv:605
TEST=try bots
R=harryjin@google.com, wangcheng@google.com

Review URL: https://codereview.chromium.org/2084533006 .
2016-06-23 20:16:55 -07:00
Frank Barchard
cf101116c9 Remove initialize to zero on output variables for inline.
Inline that uses temporary variables is currently initializing them
to 0 and passing in as output "+r".
This CL replaces the output constraint to "=&r" for most meaning an
output with early write (before inputs).  This allows the initialize
to zero step to be removed, saving 1 instruction.

BUG=libyuv:580
TESTED=local libyuv build on gcc/linux and try bots
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1895743008 .
2016-04-18 16:24:26 -07:00
Frank Barchard
36615d62a0 fix for InterpolateRow_AVX2
port scaledownby4_avx2 to gcc

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1546763002 .
2015-12-22 12:29:54 -08:00
Frank Barchard
80ca4514ef change scale down by 4 to use rounding.
TBR=harryjin@google.com
BUG=libyuv:447

Review URL: https://codereview.chromium.org/1525033005 .
2015-12-15 21:25:18 -08:00
Frank Barchard
70445ef2ef avx2 scale down by 2 for gcc
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1520423003 .
2015-12-15 10:59:20 -08:00
Frank Barchard
ae55e41851 use rounding in scaledown by 2
When scaling down by 2 the formula should round consistently.
(a+b+c+d+2)/4
The C version did but the SSE2 version was doing 2 averages.
avg(avg(a,b),avg(c,d))
This change uses a sum, then rounds.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:447,libyuv:527

Review URL: https://codereview.chromium.org/1513183004 .
2015-12-14 17:25:36 -08:00
Frank Barchard
3e38762d6b fix avx2 box filter bug for yuv down sampling.
offset to second group of pixels was off by 16.
should have been 32, not 16.
requires avx2 hardware and wide image for test.

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:492,libyuv:501

Review URL: https://codereview.chromium.org/1395603002 .
2015-10-07 11:02:33 -07:00
Frank Barchard
914a9856c7 Reimplement NV21ToARGB to allow different color matrix.
Low level for NV21ToARGB written to accept yuv matrix used by
other YUV to ARGB functions.
Previously NV21 was implemented for Windows using NV12 with a different
matrix that swapped U and V.  But the Arm version of the low level does
not allow the matrix U and V contributions to be swapped.
Using a new low level function that reads NV21 and uses the same
yuvconstants as other YUV conversion functions allows an Arm port of
this function.

TBR=harryjin@google.com
BUG=libyuv:500

Review URL: https://codereview.chromium.org/1388273002 .
2015-10-06 20:34:44 -07:00
Frank Barchard
68fa59c873 add box scaling avx2 optimization for gcc
TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1392803002 .
2015-10-06 20:01:02 -07:00
Frank Barchard
d70293993f port scale box filter sse2 to gcc
TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1393653002 .
2015-10-06 16:54:26 -07:00
Frank Barchard
94d4269936 clang use scalewin
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:469

Review URL: https://webrtc-codereview.appspot.com/51329004.
2015-08-18 14:50:27 -07:00
fbarchard@google.com
2e9f3e5cf5 rename source files from row_posix.cc etc to row_gcc.cc to avoid gyp build filtering out source files from build when on windows with clang. The source code contained in row_gcc.cc is gcc syntax inline assembly available for any platform that supports gcc or clang for intel cpus.
BUG=440
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/56579004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1430 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-09 17:27:52 +00:00