1334 Commits

Author SHA1 Message Date
Frank Barchard
d4c3f45eb6 libyuv r1749 upstream for I444ToNV12
Bug: libyuv:858
Change-Id: Iacf70938ace6258e5bbd397cd78414f1025474c5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2154331
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-04-17 09:16:46 +00:00
Frank Barchard
7e05059557 Apply clang format to libyuv source
Bug: None
Change-Id: Ifd16b59d7f0dbf4402dd5741bb89d1ec06dfaac8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2131868
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Hsiu Wang <hsiu@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-04-01 18:07:34 +00:00
Frank Barchard
7f00d67d7c Remove HAVE_JPEG requirement from headers.
JPEG is currently only enabled on Windows and Linux builds, so only
call the functions if needed and available for your target platform.

Bug: b/152178870
Change-Id: I99082d2d6b1440b26c4fe6840dfafe6fc9b1df9d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2115190
Reviewed-by: Hsiu Wang <hsiu@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-03-23 21:39:53 +00:00
Frank Barchard
b5e223ac4c Upstream all libyuv changes to version 1746
Prefetch for all arm functions - helps performance at higher resolutions
Make MirrorPlane function public.
Bug: libyuv:855
Change-Id: I4020face6b52767ee78d81870314285d63e98b95
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2113650
Reviewed-by: Hsiu Wang <hsiu@google.com>
2020-03-21 20:19:44 +00:00
Frank Barchard
3db22ebc4b RAWToJ400 and RGBToJ400 use 2 step row function for Intel. RAWToJ400 Was 3996 ms, now 3309. 20.7% faster.
Call a row function for each row, based on ARGBToI400 code.
But implement row functions as 2 step conversion.  Adds the
row functions:
RAWToYJ, RGBToYJ, SSSE3 and AVX2 versions, and Any versions.
The smaller row buffer is more cache friendly on large images.

The max cache size can be configured, and is currently:
// Maximum temporary width for wrappers to process at a time, in pixels.
And the row buffer is
  SIMD_ALIGNED(uint8_t row[MAXTWIDTH * 4]);
So 8192 bytes are used for the row buffer, leaving the rest for source
and destination buffers.

blaze-bin/third_party/libyuv/libyuv_test '--gunit_filter=*R*To?400_Opt' --libyuv_width=3600 --libyuv_height=2500 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1 | sortms

Was
RAWToJ400_Opt (3996 ms)
ARGBToI400_Opt (3964 ms)
RGB24ToJ400_Opt (3960 ms)
ARGBToJ400_Opt (3909 ms)
RGBAToJ400_Opt (3885 ms)

Now
ARGBToJ400_Opt (4091 ms)
ARGBToI400_Opt (3936 ms)
RGBAToJ400_Opt (3428 ms)
RGB24ToJ400_Opt (3324 ms)
RAWToJ400_Opt (3309 ms)

Bug: libyuv:854, b/147753855
Change-Id: Ieb65fbda94e812c737f4c3c74107354b73c4bcd2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2016203
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-01-23 03:23:38 +00:00
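The two-step row approach described in 3db22ebc4b above can be sketched in scalar C++. This is an illustration only, not the libyuv code: the function names are hypothetical, kMaxTWidth = 2048 is an assumption derived from the stated 8192-byte buffer (8192 / 4 bytes per ARGB pixel), and the luma weights 77/150/29 are common full-range integer approximations rather than values taken from the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed value: 8192-byte row buffer / 4 bytes per ARGB pixel.
constexpr int kMaxTWidth = 2048;

// Step 1 (hypothetical scalar stand-in): RAW is R,G,B in memory;
// ARGB is B,G,R,A in memory.
void RawToArgbRow(const uint8_t* src_raw, uint8_t* dst_argb, int width) {
  for (int x = 0; x < width; ++x) {
    dst_argb[4 * x + 0] = src_raw[3 * x + 2];  // B
    dst_argb[4 * x + 1] = src_raw[3 * x + 1];  // G
    dst_argb[4 * x + 2] = src_raw[3 * x + 0];  // R
    dst_argb[4 * x + 3] = 255;                 // A
  }
}

// Step 2 (hypothetical scalar stand-in): ARGB to full-range luma.
// Illustrative weights; they sum to 256, so white maps to 255.
void ArgbToYjRow(const uint8_t* src_argb, uint8_t* dst_y, int width) {
  for (int x = 0; x < width; ++x) {
    const int b = src_argb[4 * x + 0];
    const int g = src_argb[4 * x + 1];
    const int r = src_argb[4 * x + 2];
    dst_y[x] = static_cast<uint8_t>((77 * r + 150 * g + 29 * b + 128) >> 8);
  }
}

// Two-step row function: convert at most kMaxTWidth pixels at a time
// through a small aligned temporary, so the intermediate stays in cache.
void RawToYjRow(const uint8_t* src_raw, uint8_t* dst_y, int width) {
  assert(width >= 0);
  alignas(32) uint8_t row[kMaxTWidth * 4];
  while (width > 0) {
    const int tw = std::min(width, kMaxTWidth);
    RawToArgbRow(src_raw, row, tw);
    ArgbToYjRow(row, dst_y, tw);
    src_raw += tw * 3;
    dst_y += tw;
    width -= tw;
  }
}
```

Calling RawToYjRow once per image row keeps the intermediate ARGB data bounded by the temporary buffer regardless of image width, which is the cache-friendliness argument the commit message makes.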
Frank Barchard
1cea4235af RAWToJ400 for big endian RGB to grey scale.
On Pixel 3
Was
BM_ConvertToGray/1280/720/3                        2360958 ns      2334984 ns         2999
BM_ConvertToGray/1279/721/3                        2360289 ns      2334329 ns         2994
BM_ConvertGrayTensorflowCoefficients/1280/720/3    2983296 ns      2947113 ns         2259
BM_ConvertGrayTensorflowCoefficients/1279/721/3    2871205 ns      2835359 ns         2170

Now
BM_ConvertToGray/1280/720/3                        2358469 ns      2334068 ns         2997
BM_ConvertToGray/1279/721/3                        2364584 ns      2336892 ns         2995
BM_ConvertGrayTensorflowCoefficients/1280/720/3     281312 ns       278244 ns        25170
BM_ConvertGrayTensorflowCoefficients/1279/721/3     351310 ns       347229 ns        20217

BUG=libyuv:854

Change-Id: If2192affc2d3219e0fb824737d75b9374a25d709
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2003236
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2020-01-16 00:29:11 +00:00
Frank Barchard
6e6f81b803 Floating point Gaussian kernels
On SkylakeX for 720p
TestGaussPlane_F32 (657 ms)

On Pixel3
TestGaussPlane_F32 (1787 ms)

Bug: libyuv:852, b/145611468
Change-Id: I9859af1b9381621067992305727da285f82bdded
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1949667
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Marat Dukhan <maratek@google.com>
2019-12-09 04:45:59 +00:00
Frank Barchard
6502179e4c I210ToAR30 support for 422 10 bit to 10 bit RGB
BUG=960620, libyuv:845, b/129864744

Change-Id: I43b152568b7f297f81624d47e56a334c127be17b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1901465
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-11-06 19:37:22 +00:00
Frank Barchard
1f12946068 Add U444ToABGR, J444ToABGR, H444ToABGR, H444ToARGB and ConvertToARGB support
BUG=960620, libyuv:845, b/129864744

Change-Id: I9f80cda3be8e13298c596fac514f65a23a38d3d0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1900310
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-11-05 22:11:20 +00:00
Frank Barchard
53e014c99d BT.2020 pull in tests and upstream fixes; expose a few more methods.
This adds some missing prototypes from the BT.2020 CL as well as expands
the H444 and J444 results.

BUG=960620, libyuv:845, b/129864744

Change-Id: I8ea3959379f1bb2edb857d4eb90fb9a1f6aa4e03
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1899093
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-11-05 20:01:59 +00:00
Frank Barchard
22f8aad8bc RAWToRGBA for 3 channel OCR
Replace ARM64 only row function with high level function
that implements SSSE3, 32 bit Neon and C.

Compared to 2 step RAWToARGB + ARGBToRGBA on row level:
3.1x faster on ARM
6.2% faster on Intel

BUG=b/140748379

Change-Id: Ia8636d9e4fcdbe10b8c2e81610a54728e29845cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1860914
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-10-14 22:27:37 +00:00
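The one-step versus two-step comparison in 22f8aad8bc above can be illustrated with scalar code. This sketch assumes the usual libyuv memory layouts (RAW as R,G,B; ARGB as B,G,R,A; RGBA as A,B,G,R); the function names are hypothetical and the real row functions are SIMD.

```cpp
#include <cassert>
#include <cstdint>

// Two-step path: RAW -> ARGB into a temporary row, then ARGB -> RGBA.
// tmp_argb must hold width * 4 bytes.
void RawToRgbaTwoStep(const uint8_t* raw, uint8_t* tmp_argb, uint8_t* rgba,
                      int width) {
  assert(width >= 0);
  for (int x = 0; x < width; ++x) {
    tmp_argb[4 * x + 0] = raw[3 * x + 2];  // B
    tmp_argb[4 * x + 1] = raw[3 * x + 1];  // G
    tmp_argb[4 * x + 2] = raw[3 * x + 0];  // R
    tmp_argb[4 * x + 3] = 255;             // A
  }
  for (int x = 0; x < width; ++x) {
    rgba[4 * x + 0] = tmp_argb[4 * x + 3];  // A
    rgba[4 * x + 1] = tmp_argb[4 * x + 0];  // B
    rgba[4 * x + 2] = tmp_argb[4 * x + 1];  // G
    rgba[4 * x + 3] = tmp_argb[4 * x + 2];  // R
  }
}

// One-step path: write RGBA directly, with no intermediate row.
void RawToRgbaOneStep(const uint8_t* raw, uint8_t* rgba, int width) {
  assert(width >= 0);
  for (int x = 0; x < width; ++x) {
    rgba[4 * x + 0] = 255;             // A
    rgba[4 * x + 1] = raw[3 * x + 2];  // B
    rgba[4 * x + 2] = raw[3 * x + 1];  // G
    rgba[4 * x + 3] = raw[3 * x + 0];  // R
  }
}
```

Both paths produce the same bytes; the one-step version saves a full write and read of the temporary row.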
Frank Barchard
fce0fed542 ARGBToY use 8 bit precision instead of 7 bit.
Neon and GCC Intel optimized, but win32 and mips not optimized.

BUG=libyuv:842, b/141482243

Change-Id: Ia56fa85c8cc1db51f374bd0c89b56d21ec94afa7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1825642
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2019-10-07 23:01:10 +00:00
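To see why the precision change in fce0fed542 above matters, here is a small sketch comparing 7-bit and 8-bit fixed-point luma. The coefficients are assumptions: they are the commonly used BT.601 limited-range integer approximations (66/129/25 at 8 bits, halved and rounded to 33/65/13 at 7 bits), not values copied from the commit.

```cpp
#include <cassert>
#include <cstdint>

// 8 fractional bits: coefficients sum to 220, offset is 16.5 * 256.
constexpr uint8_t RgbToY8(int r, int g, int b) {
  return static_cast<uint8_t>((66 * r + 129 * g + 25 * b + 0x1080) >> 8);
}

// 7 fractional bits: coefficients sum to 111 (not 110), offset is 16.5 * 128.
// The rounding of each coefficient accumulates into a visible range error.
constexpr uint8_t RgbToY7(int r, int g, int b) {
  return static_cast<uint8_t>((33 * r + 65 * g + 13 * b + 0x0840) >> 7);
}

// Black agrees, but white overshoots the nominal 235 at 7-bit precision.
static_assert(RgbToY8(0, 0, 0) == 16, "black, 8 bit");
static_assert(RgbToY7(0, 0, 0) == 16, "black, 7 bit");
static_assert(RgbToY8(255, 255, 255) == 235, "white, 8 bit");
static_assert(RgbToY7(255, 255, 255) == 237, "white, 7 bit");
```

Under these assumed coefficients the 7-bit path maps white to 237 instead of 235, which is the kind of error the extra bit of precision removes.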
Frank Barchard
9b63884a3e Add ABGRToNV21 and ABGRToNV12
Fix ARGBToUVJRow_AVX2 constants for win32

BUG=libyuv:833, libyuv:839

Change-Id: Id4731a573d40d7a9b46fcc31c2fee295483e1ff6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1739509
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Hirokazu Honda <hiroh@chromium.org>
2019-08-07 01:29:13 +00:00
Frank Barchard
fec9121b67 SwapUV AVX2 and SSSE3
Based on ARGBShuffle but with count adjusted and new shuffle mask

BUG=libyuv:809

Change-Id: Idd936ee6bedcf285607a68c2fc54d876b4becc01
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1711882
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-07-26 18:41:40 +00:00
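The shuffle-based swap in fec9121b67 above can be modeled in scalar code. This is a sketch, assuming a 16-byte pshufb-style control mask; the mask constant and function name are hypothetical, and the width is counted in UV pairs (2 bytes each) rather than ARGB pixels (4 bytes each), which is the "count adjusted" part.

```cpp
#include <cassert>
#include <cstdint>

// Shuffle control for one 16-byte block: swap the two bytes of each pair.
constexpr uint8_t kSwapUVMask[16] = {1, 0, 3,  2,  5,  4,  7,  6,
                                     9, 8, 11, 10, 13, 12, 15, 14};

// Scalar model of a byte shuffle applied 16 bytes (8 UV pairs) at a time.
// width is the number of UV pairs and must be a multiple of 8 here; a real
// implementation would handle the remainder in a separate wrapper.
void SwapUVRowModel(const uint8_t* src_uv, uint8_t* dst_vu, int width) {
  assert(width >= 0 && width % 8 == 0);
  for (int i = 0; i < width * 2; i += 16) {
    for (int j = 0; j < 16; ++j) {
      dst_vu[i + j] = src_uv[i + kSwapUVMask[j]];
    }
  }
}
```

The same routine converts NV12 chroma to NV21 chroma and back, since the swap is its own inverse.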
Frank Barchard
f1c00932df NV21 unittest and benchmark
BUG=libyuv:809

Change-Id: I75afb5612dcd05820479848a90ad16b07a7981bc
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1707229
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-07-18 02:13:02 +00:00
Frank Barchard
09cfb2bbd6 Update to r1732 for more robust jpeg
Includes a rounding change for neon.
BUG=b/135532289

Change-Id: I36ffb57b55db6c64804ad169def865be1ac6d66e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1684439
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Chong Zhang <chz@google.com>
2019-07-01 22:32:36 +00:00
Frank Barchard
af9bc4f67c RGB24ToJ420 for full range YUV
BUG=b/249563884

Change-Id: I41b45b274313ec22f5e3799000242da1ec692586
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1629527
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2019-05-29 02:40:22 +00:00
Frank Barchard
681c6c6739 Add LIBYUV_API to NV12ToABGR and I444Rotate, I444Scale
Gaussian blur low levels ported to 32 bit neon.
But they are not hooked up to anything but a unittest.

Bug:b/248041731, b/132108021, b/129908793
Change-Id: Iccebb8ffd6b719810aa11dd770a525227da4c357
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1611206
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Chong Zhang <chz@google.com>
2019-05-14 01:18:06 +00:00
Frank Barchard
413a8d8041 Add AYUVToNV12 and NV21ToNV12
BUG=libyuv:832
TESTED=out/Release/libyuv_unittest --gtest_filter=*ToNV12* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

R=rrwinterton@gmail.com

Change-Id: Id03b4613211fb6a6e163d10daa7c692fe31e36d8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1560080
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2019-04-12 17:48:45 +00:00
Frank Barchard
5b6042fa0d add YUV24 and AYUV formats
Alternatives to RGB24 and AYUV for working with GPU.

BUG=libyuv:832
TESTED=out/Release/libyuv_unittest --gtest_filter=*NV21To???24* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1
R=rrwinterton@gmail.com

Change-Id: I5559c63f4bd4c847492fcb1571f7b03c58146689
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1501735
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-03-05 02:53:56 +00:00
Frank Barchard
12f9b5f351 Add comment for jpeg parameters.
Bug: None
Test: Try bots
Change-Id: I7b90731e828169af96b3e0b8f8821635cff57755
Reviewed-on: https://chromium-review.googlesource.com/c/1308819
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-11-01 18:18:50 +00:00
Frank Barchard
b36c86fdfe Port box filter to NEON
Bug: libyuv:821
Change-Id: I4a6b9bee2c2fae199c73c9ec7ecb32bde37c1852
Tested: out/Release/libyuv_unittest --gtest_filter=*ScaleFrom1920x1080_Box --libyuv_width=160 --libyuv_height=90 --libyuv_repeat=1000
Reviewed-on: https://chromium-review.googlesource.com/c/1298598
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-10-25 18:56:29 +00:00
Frank Barchard
b416d36c89 disable ARGBToRAWRow_AVX2 and ARGBToRGB24Row_AVX2
Bug: b:118386049
Change-Id: I3cf46f0f1a9f24523d5b1c86e9201b92a5bd32b0
Tested: out/Release/libyuv_unittest --gtest_filter=*ARGBToRAW*
Reviewed-on: https://chromium-review.googlesource.com/c/1296803
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2018-10-24 22:11:24 +00:00
Frank Barchard
1fe0613c3f MJPGToNV21
Add jpeg to NV21 conversions, unittests and conversions
for I444, I422, I420 and I420 to NV21 needed for internals.

Bug: libyuv:820
Change-Id: Idf0f15f91307e80a82cd23943f6eed5508f13fe2
Tested: out/Release/libyuv_unittest --sandbox_unittests --gtest_filter=*MJ*
Reviewed-on: https://chromium-review.googlesource.com/c/1297710
Reviewed-by: Johann Koenig <johannkoenig@google.com>
2018-10-24 22:01:13 +00:00
Frank Barchard
97b3990dec NV21ToRAW and NV12ToRAW functions added
RAW is a big endian style RGB buffer with R first in memory, then G and B.
Convert NV21 and NV12 to RAW format.

Performance on SkylakeX for 720p with AVX2
I420ToRAW_Opt (388 ms)
H420ToRAW_Opt (371 ms)
NV12ToRAW_Opt (341 ms)
NV21ToRAW_Opt (339 ms)

SSSE3
I420ToRAW_Opt (507 ms)
H420ToRAW_Opt (481 ms)
NV12ToRAW_Opt (498 ms)
NV21ToRAW_Opt (493 ms)

C
I420ToRAW_Opt (2287 ms)
H420ToRAW_Opt (2246 ms)
NV12ToRAW_Opt (2191 ms)
NV21ToRAW_Opt (2204 ms)

Performance on Pixel 2 for 720p
out/Release/bin/run_libyuv_unittest -v -t 7200 --gtest_filter=*NV??ToR*Opt --libyuv_repeat=1000 --libyuv_width=1280 --libyuv_height=720
LibYUVConvertTest.NV12ToRGB24_Opt (1739 ms)
LibYUVConvertTest.NV21ToRGB24_Opt (1734 ms)
LibYUVConvertTest.NV12ToRAW_Opt (1719 ms)
LibYUVConvertTest.NV21ToRAW_Opt (1691 ms)
LibYUVConvertTest.NV12ToRGB565_Opt (2152 ms)

Bug: libyuv:778, b:117522975
Test: add new NV21ToRAW and NV12ToRAW tests
Change-Id: Ieabb68a2c6d8c26743e609c5696c81bb14fb253f
Reviewed-on: https://chromium-review.googlesource.com/c/1272615
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2018-10-10 18:11:10 +00:00
Frank Barchard
20bf569a04 Fix ConvertToI420() for odd crop_y
The original src_u calculation of FOURCC_I420 shifted half width if
crop_y is odd.
This CL fixes the problem and also adds a test case for it.

Bug: b:115278653
Test: pass libyuv_unittest
Change-Id: Ia9732d22e64e13de26df47726ba44ad1c5a06484
Reviewed-on: https://chromium-review.googlesource.com/c/1258743
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-10-03 19:14:01 +00:00
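The odd crop_y problem in 20bf569a04 above comes down to where the division by two happens. The following sketch reconstructs the two offset formulas from the description; the function names are hypothetical and the expressions are an assumption about the shape of the fix, not a copy of the CL.

```cpp
#include <cassert>

// Offset into the U plane of an I420 buffer, relative to the plane start.
// halfwidth is the chroma row width in bytes.

// Dividing the combined sum by 2 splits an odd crop_y into a whole chroma
// row plus half a row, so the pointer lands mid-row.
constexpr int BuggyChromaOffset(int halfwidth, int crop_x, int crop_y) {
  return (halfwidth * crop_y + crop_x) / 2;
}

// Halving crop_y and crop_x separately keeps the row and column terms apart.
constexpr int FixedChromaOffset(int halfwidth, int crop_x, int crop_y) {
  return halfwidth * (crop_y / 2) + crop_x / 2;
}

// Even crop_y: both agree.
static_assert(BuggyChromaOffset(4, 2, 2) == 5, "even crop_y, buggy");
static_assert(FixedChromaOffset(4, 2, 2) == 5, "even crop_y, fixed");
// Odd crop_y: the buggy form is off by halfwidth / 2 (here 2 bytes).
static_assert(BuggyChromaOffset(4, 0, 3) == 6, "odd crop_y, buggy");
static_assert(FixedChromaOffset(4, 0, 3) == 4, "odd crop_y, fixed");
```

With halfwidth = 4 and crop_y = 3, the buggy form points at column 2 of chroma row 1 instead of column 0, which matches the "shifted half width" symptom in the message.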
Frank Barchard
9a07219dc8 Documentation update for GYP and environment variables
Bug: libyuv:816, libyuv:804
Change-Id: I73a6960b2cc6f3ca31c43c44ccd8b01f5e9e7013
Test: Untested
Reviewed-on: https://chromium-review.googlesource.com/1205053
Reviewed-by: Nico Weber <thakis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-09-04 19:45:41 +00:00
Frank Barchard
4e666c4354 Add H420ToRGB565 and J420ToRGB565 unittests
Bug: libyuv:812
Test: LibYUVConvertTest.H420ToRGB565_Opt
Change-Id: Ie85ece74e0bc2b5f789cfcde76703fff6474c0e0
Reviewed-on: https://chromium-review.googlesource.com/1171380
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-08-10 21:08:46 +00:00
Frank Barchard
57de382902 MMI ifdef guards and add source to various build files.
Bug: libyuv:810,libyuv:811
Test: cmake . && make
Change-Id: I521b45ccb6e49ff70823e415efa99fc5b9daad99
Reviewed-on: https://chromium-review.googlesource.com/1162503
Reviewed-by: Johann Koenig <johannkoenig@google.com>
2018-08-03 18:37:23 +00:00
Frank Barchard
55f5d91f11 Disable old int types by default.
Legacy types can cause build errors with code that defines
them differently.  Disable them by default.  Allow the types
to be enabled with #define LIBYUV_LEGACY_TYPES

BUG=libyuv:808
TESTED=libyuv try bots still build

Change-Id: I48928329393f44a377cec781e645570b14569668
Reviewed-on: https://chromium-review.googlesource.com/1129558
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-07-09 21:16:47 +00:00
Frank Barchard
9ac881f4aa msa use void * for loads
the built in __msa_ld_b() expects a void * without const.
Cast pointers to void * to avoid build warning.

TBR=johannkoenig@google.com
Bug: libyuv:805
Change-Id: Iabc4820ecf4a3a7dcb0063e67ce276ae2a4f0501
Tested: gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true"
ninja -v -C out/Release libyuv_unittest
Reviewed-on: https://chromium-review.googlesource.com/1125400
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-07-04 00:24:19 +00:00
Frank Barchard
4d67b3e851 Add H420 and H422 to ConvertToARGB()
H420/H422 are BT.709 variants

TBR=braveyao@chromium.org
BUG=libyuv:799
TESTED=try bots tested build on all platforms

Change-Id: I007d8981d91ca0748c59403759109bbcd88f286c
Reviewed-on: https://chromium-review.googlesource.com/1115719
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-06-26 22:52:42 +00:00
Frank Barchard
083aa718b9 Add AR30 and AB30 to ConvertToARGB() and fix negative NV12 height
BUG=libyuv:799
TESTED=try bots build

Change-Id: Ib4ce8d928069445a710c1e30ea85d9dccc820b6c
Reviewed-on: https://chromium-review.googlesource.com/1097561
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-06-12 19:04:40 +00:00
Frank Barchard
a7fb978e30 ARGBExtractAlphaRow_Any_AVX2 fix pixel count mask
Mask was set to 32, but should have been 31.
BUG=libyuv:798
TESTED=try bots tested

Change-Id: I6120928873a4a2f1efef907d8e8296ca8c20bb03
Reviewed-on: https://chromium-review.googlesource.com/1054830
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-05-11 07:13:58 +00:00
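The mask fix in a7fb978e30 above is easy to check numerically. This sketch assumes the wrapper splits the width with a bit mask into a SIMD-sized part and a remainder; the struct and function names are hypothetical.

```cpp
#include <cassert>

struct Split {
  int simd_pixels;  // handled by the SIMD row function
  int remainder;    // handled by the tail path
};

// For a kernel that processes 32 pixels per iteration, the mask must be
// 31 (0x1f) so that width & ~mask is a multiple of 32.
constexpr Split SplitWidth(int width, int mask) {
  return Split{width & ~mask, width & mask};
}

// Correct mask: 40 pixels = 32 for SIMD + 8 left over.
static_assert(SplitWidth(40, 31).simd_pixels == 32, "mask 31, simd part");
static_assert(SplitWidth(40, 31).remainder == 8, "mask 31, remainder");
// Wrong mask: only bit 5 is tested, so the SIMD part is no longer a
// multiple of 32 and the remainder can be a full block.
static_assert(SplitWidth(40, 32).simd_pixels == 8, "mask 32, simd part");
static_assert(SplitWidth(40, 32).remainder == 32, "mask 32, remainder");
```

With a mask of 32 the split is wrong for most widths that are not multiples of 32, which is why the one-character change matters.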
Frank Barchard
7e5e12757b use attribute to alias for punning float to int
Bug: libyuv:791
Test: g++ -Iinclude -I../libvpx/third_party/libwebm -I../libvpx/vp8 -I../libvpx/vp8 -I../libvpx/vp9 -I../libvpx/vp9 -Iinclude -m64 -DNDEBUG -O3 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=0 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -Wall -Wdisabled-optimization -Wfloat-conversion -Wpointer-arith -Wtype-limits -Wcast-qual -Wvla -Wuninitialized -Wunused -Wextra -I. -I"../libvpx" -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -Wno-unused-parameter -c -o third_party/libyuv/source/row_common.cc.o source/row_common.cc
Change-Id: Ia006cb9212b671ae668cab5ec0b29759024a2c8a
Reviewed-on: https://chromium-review.googlesource.com/1012462
Reviewed-by: Johann Koenig <johannkoenig@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-04-13 19:20:52 +00:00
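The aliasing attribute mentioned in 7e5e12757b above can be sketched as follows. The typedef name and helper functions are assumptions for illustration; the attribute itself is a GCC and clang extension, so the sketch falls back to memcpy on other compilers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

#if defined(__GNUC__) || defined(__clang__)
// A uint32_t that is allowed to alias objects of other types, so reading a
// float through it does not violate strict aliasing.
typedef uint32_t __attribute__((__may_alias__)) uint32_alias_t;
#endif

// Returns the raw IEEE-754 bit pattern of *f.
inline uint32_t FloatBits(const float* f) {
  assert(f != nullptr);
#if defined(__GNUC__) || defined(__clang__)
  return *reinterpret_cast<const uint32_alias_t*>(f);
#else
  uint32_t u;
  std::memcpy(&u, f, sizeof(u));  // portable fallback
  return u;
#endif
}
```

A plain reinterpret_cast to uint32_t* would compile but can trigger strict-aliasing warnings or miscompiles at -O3; the attribute tells the compiler the access is intentional.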
Frank Barchard
a9626b9daf Disable AVX512 for iOS simulator xcode 9 builds.
iOS simulator has the option to build with xcode instead of clang.
GN use_xcode_clang=true enables the xcode build.
As of Xcode 9.2, the clang version used does not support
AVX512.  The version reported is version 9, but for normal clang,
version 7 is sufficient for AVX512.
When a version of XCode does support AVX512, the version check can
be updated to allow AVX512 for newer versions of XCode.
With XCode 9.2 the following macro is set.
__APPLE_CC__ 6000

Bug: libyuv:789
Test: gn gen out/Release "--args=is_debug=false target_os=\"ios\" ios_enable_code_signing=false target_cpu=\"x86\" use_xcode_clang=true"
Change-Id: I5a9a0b4a2760c7d09a4bcb464b3668979113b07e
Reviewed-on: https://chromium-review.googlesource.com/991595
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-04-03 18:45:14 +00:00
Frank Barchard
4ad33344cf Pass float parameters via vector 2 float and "w" for scalar multiply.
Scalar multiply expects a 'd' register.  The "w" (float) uses 's' for float
and won't work with the multiply in 32 bit (it does in 64 bit).
A vector 2 of float passes as 'd' register.
A vector 4 of float passes as 'q' register.
This change copies the float into the first entry of a vector 2
and passes that.  The optimizer removes the extra copy, allowing
the single float to use referenced as

Test: LibYUVPlanarTest.TestByteToFloat
Bug: libyuv:786
Change-Id: I8773c5bae043c7b84e1d1db7fdea6731aa0b1323
Reviewed-on: https://chromium-review.googlesource.com/973984
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-03-28 21:52:08 +00:00
Frank Barchard
548ec65656 Require clang 6 for AVX512 support
row.h adds CLANG_HAS_AVX512
function ifdefs in row.h for avx512
source code ifdefed function by function for
avx512 and avx2.

Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: If32b51459685d0d5785c5c1e94c8f668f8e74b55
Reviewed-on: https://chromium-review.googlesource.com/982402
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-03-28 02:38:39 +00:00
Frank Barchard
9d70f13c8f cpuid sandbox friendlier avoiding getenv()
Move getenv to unittest.cc to allow libyuv to be
run in sandbox for x86, x64 and aarch64

Bug: libyuv:767
Test: unittests still run and respect environment variables
Change-Id: I84cb1717977828776142b51c029774b3e6b142a3
Reviewed-on: https://chromium-review.googlesource.com/969645
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-03-20 01:04:30 +00:00
Frank Barchard
83aa7512c1 AVX512 VBMI version of ARGBToRGB24
Use VBMI instructions but on AVX2 registers to avoid clockrate change.

Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: Id4f8ad1e0e142a380c8a46c5eab90ce145a10edd
Reviewed-on: https://chromium-review.googlesource.com/956609
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-03-10 02:04:48 +00:00
Frank Barchard
29383c8b03 switch to static_assert for clang-tidy
Bug: None
Test: try bots and lint pass
Change-Id: I7429b394c89450c13732205dae672793e4bb6f44
Reviewed-on: https://chromium-review.googlesource.com/939844
Reviewed-by: Noah Richards <noahric@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-02-27 19:59:56 +00:00
Frank Barchard
368ac76acf clang-tidy fix for MJPEGToI420 and MJPEGToARGB
Make parameters match in the code to the header.

TBR=braveyao@chromium.org
Bug: libyuv:782
Test: try bots still build
Change-Id: Id53fa2fe988aee5e125d87bc5fe70cce6b275403
Reviewed-on: https://chromium-review.googlesource.com/938948
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-02-27 08:37:55 +00:00
Frank Barchard
0ea50cbc74 NV21ToRGB24_NEON conversion
32 bit thumb2 performance:
NV12ToARGB_Opt (472 ms)
NV21ToARGB_Opt (466 ms)
NV12ToRGB24_Opt (457 ms)
NV21ToRGB24_Opt (457 ms)
NV12ToRGB565_Opt (501 ms)

Bug: libyuv:778
Test: add new NV21ToRGB24 test
Change-Id: I330585789835c79ee4b4da61d164716598268df3
Reviewed-on: https://chromium-review.googlesource.com/924646
Reviewed-by: Cheng Wang <wangcheng@google.com>
2018-02-22 22:24:24 +00:00
Frank Barchard
18c9ab106c Rotate ARGB using scale_row.h header
ARGB rotation using scaling code.  Previously it had forward
declarations of the low level row functions used.  This CL
uses the header and hooks up Any and MSA versions of the code.

Bug: libyuv:779
Test: perf record out/Release/libyuv_unittest --gtest_filter=*ARGBRotate90_Opt --libyuv_width=640 --libyuv_height=359 --libyuv_repeat=999
Change-Id: Ifacd58b26bb17a236181a404fad589fd2543b911
Reviewed-on: https://chromium-review.googlesource.com/927530
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
2018-02-21 00:53:53 +00:00
Frank Barchard
9c9215b218 End swap 10 bit RGB
Bug: libyuv:777
Test: None
Change-Id: I69b81f51c50d7739cfdb3cfb0c3d315c32bd63d2
Reviewed-on: https://chromium-review.googlesource.com/923042
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-02-15 23:50:40 +00:00
Frank Barchard
6630558875 10 bit YUV to 10 bit BGR
BGR variation of 10 bit conversion using swapped U and V
and mirrored matrix to produce AB30 format instead of AR30.

Bug: libyuv:777
Test: LibYUVConvertTest.H010ToAB30_Opt
Change-Id: I96d115a5d1e12138f40cb548871e03aa3ab210eb
Reviewed-on: https://chromium-review.googlesource.com/922284
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-02-15 22:44:36 +00:00
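The relationship between AR30 and AB30 in 6630558875 above can be shown with a packing sketch. It assumes the 2:10:10:10 little-endian layout with opaque alpha in the top two bits, blue in the low bits for AR30 and red in the low bits for AB30; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// AR30: A (2 bits) | R (10) | G (10) | B (10), B in the low bits.
constexpr uint32_t PackAR30(uint32_t r, uint32_t g, uint32_t b) {
  return (b & 0x3ffu) | ((g & 0x3ffu) << 10) | ((r & 0x3ffu) << 20) |
         0xc0000000u;
}

// AB30 is the same packing with the red and blue channels exchanged, which
// is why a converter can reuse the AR30 path with swapped inputs.
constexpr uint32_t PackAB30(uint32_t r, uint32_t g, uint32_t b) {
  return PackAR30(b, g, r);
}

static_assert(PackAR30(1023, 0, 0) == 0xfff00000u, "red in AR30");
static_assert(PackAB30(1023, 0, 0) == 0xc00003ffu, "red in AB30");
```

Swapping U and V and mirroring the conversion matrix, as the commit describes, has the same effect as exchanging R and B at the output, so no separate packing routine is needed.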
Frank Barchard
e1f6c1c0b5 tidy applied with readability-inconsistent-declaration-parameter-name
Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: I023699a7aa61ea3f5e4a21647112691ea5739281
Reviewed-on: https://chromium-review.googlesource.com/902170
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
2018-02-07 00:24:25 +00:00
Frank Barchard
5790a765b9 I422ToUYVYRow_AVX2 use vpmovzxbd instead of vpermq
I422ToUYVYRow_AVX2 optimized from 7 cycles per 32 pixels to 4.6 cycles.
Instead of 2 vpermq and vpunpcklbw:
vmovdqu    (%1),%%xmm2
vmovdqu    0x00(%1,%2,1),%%xmm3
vpermq     $0xd8,%%ymm2,%%ymm2
vpermq     $0xd8,%%ymm3,%%ymm3
vpunpcklbw %%ymm3,%%ymm2,%%ymm2

..use vpmovzxbd to zero-extend the bytes to dwords, then vpslld and vpor
vpmovzxbd  (%1),%%ymm2
vpmovzxbd  0x00(%1,%2,1),%%ymm3
vpslld     $0x10,%%ymm3,%%ymm3
vpor       %%ymm3,%%ymm2,%%ymm2
which reduces the port 5 bottleneck by 1 cycle.

Bug: libyuv:556
Test: out/Release/libyuv_unittest --gtest_filter=*I42?To*UY*Opt

Change-Id: I53799e53cc6b090a1a695c839094c193be3eecaf
Reviewed-on: https://chromium-review.googlesource.com/899873
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2018-02-02 23:57:35 +00:00
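The instruction sequence in 5790a765b9 above (zero-extend, shift left by 16, OR) can be modeled for a single 32-bit lane in scalar code. This is a sketch of the arithmetic only; the function names are hypothetical, and the step that merges luma into the odd bytes is an assumption based on the standard UYVY byte order (U, Y0, V, Y1).

```cpp
#include <cassert>
#include <cstdint>

// One lane of: vpmovzxbd U, vpmovzxbd V, vpslld $16 on V, vpor.
// In little-endian memory the result reads U, 0, V, 0.
constexpr uint32_t CombineUV(uint8_t u, uint8_t v) {
  return static_cast<uint32_t>(u) | (static_cast<uint32_t>(v) << 16);
}

// Assumed follow-up: the two luma samples fill the empty odd bytes,
// giving one UYVY macropixel (two pixels).
constexpr uint32_t PackUYVY(uint8_t u, uint8_t y0, uint8_t v, uint8_t y1) {
  return CombineUV(u, v) | (static_cast<uint32_t>(y0) << 8) |
         (static_cast<uint32_t>(y1) << 24);
}

static_assert(CombineUV(0x11, 0x22) == 0x00220011u, "U,0,V,0");
static_assert(PackUYVY(0x11, 0xa0, 0x22, 0xb0) == 0xb022a011u, "U,Y0,V,Y1");
```

The zero-extending loads do the widening as part of the load itself, so the separate permute and unpack steps of the original sequence are no longer needed.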
Frank Barchard
664c735677 I420ToYUY2_AVX2 port
I420 and I422 To YUY2 and UYVY ported from SSE2 to AVX2.

Was SSE2
I420ToYUY2_Opt (135 ms)
I420ToUYVY_Opt (148 ms)
I422ToYUY2_Opt (145 ms)
I422ToUYVY_Opt (142 ms)

Now AVX2
I420ToYUY2_Opt (133 ms)
I420ToUYVY_Opt (130 ms)
I422ToYUY2_Opt (127 ms)
I422ToUYVY_Opt (137 ms)

Bug: libyuv:556
Test: out/Release/libyuv_unittest --sandbox_unittests --gtest_filter=*I42?To*UY*Opt
Change-Id: Ic35f97cee02dc009fd98785589ba17c7cf50bb35
Reviewed-on: https://chromium-review.googlesource.com/892493
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-02-01 00:33:25 +00:00
Frank Barchard
ed96b7b2c7 AVX2 port of H010ToAR30_AVX2
Was SSSE3 H010ToAR30_Opt (635 ms)
Now AVX2  H010ToAR30_Opt (448 ms)

Bug: libyuv:751
Test:  LibYUVConvertTest.H010ToAR30_Opt
Change-Id: I17b1a0e3268c4a9836e09683dd3377fb1ce60932
Reviewed-on: https://chromium-review.googlesource.com/889906
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-01-27 00:14:27 +00:00