Add jpeg to NV21 conversions, unittests and conversions
for I444, I422, I420 and I420 to NV21 needed for internals.
Bug: libyuv:820
Change-Id: Idf0f15f91307e80a82cd23943f6eed5508f13fe2
Tested: out/Release/libyuv_unittest --sandbox_unittests --gtest_filter=*MJ*
Reviewed-on: https://chromium-review.googlesource.com/c/1297710
Reviewed-by: Johann Koenig <johannkoenig@google.com>
RAW is a big endian style RGB buffer with R first in memory, then G and B.
Convert NV21 and NV12 to RAW format.
Performance on SkylakeX for 720p with AVX2
I420ToRAW_Opt (388 ms)
H420ToRAW_Opt (371 ms)
NV12ToRAW_Opt (341 ms)
NV21ToRAW_Opt (339 ms)
SSSE3
I420ToRAW_Opt (507 ms)
H420ToRAW_Opt (481 ms)
NV12ToRAW_Opt (498 ms)
NV21ToRAW_Opt (493 ms)
C
I420ToRAW_Opt (2287 ms)
H420ToRAW_Opt (2246 ms)
NV12ToRAW_Opt (2191 ms)
NV21ToRAW_Opt (2204 ms)
Performance on Pixel 2 for 720p
out/Release/bin/run_libyuv_unittest -v -t 7200 --gtest_filter=*NV??ToR*Opt --libyuv_repeat=1000 --libyuv_width=1280 --libyuv_height=720
LibYUVConvertTest.NV12ToRGB24_Opt (1739 ms)
LibYUVConvertTest.NV21ToRGB24_Opt (1734 ms)
LibYUVConvertTest.NV12ToRAW_Opt (1719 ms)
LibYUVConvertTest.NV21ToRAW_Opt (1691 ms)
LibYUVConvertTest.NV12ToRGB565_Opt (2152 ms)
Bug: libyuv:778, b:117522975
Test: add new NV21ToRAW and NV12ToRAW tests
Change-Id: Ieabb68a2c6d8c26743e609c5696c81bb14fb253f
Reviewed-on: https://chromium-review.googlesource.com/c/1272615
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
The original src_u calculation of FOURCC_I420 shifted half width if
crop_y is odd.
This CL fixs the problem and also add a test case for it.
Bug: b:115278653
Test: pass libyuv_unittest
Change-Id: Ia9732d22e64e13de26df47726ba44ad1c5a06484
Reviewed-on: https://chromium-review.googlesource.com/c/1258743
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
When loading or storing the data, the unaligned address will greatly degrade
the optimization performance, so non-aligned access instructions are required
on the loongson platform.
Also delete the optimization function:ScaleARGBFilterCols_MMI,
because it degraded the performance.
BUG=libyuv:804
R=fbarchard@chromium.org
Change-Id: If4c15886a21cdcbac7ae8b336292e4549acf1e47
Reviewed-on: https://chromium-review.googlesource.com/1164627
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This was changed in 21be9122aadf7824efe3fc19b2a09ff253a688e1.
Change-Id: I6c04dc92f673557e10c231bd090ec8aa88b6bee4
Reviewed-on: https://chromium-review.googlesource.com/1146183
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Currently, libyuv supports MIPS SIMD Arch(MSA),
but libyuv does not supports MultiMedia Instruction(MMI)(such as loongson3a platform).
In order to improve performance of libyuv on loongson3a platform,
this provides optimize 98 functions with mmi.
BUG=libyuv:804
Change-Id: I8947626009efad769b3103a867363ece25d79629
Reviewed-on: https://chromium-review.googlesource.com/1122064
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
the built in __msa_ld_b() expects a void * without const.
Cast pointers to void * to avoid build warning.
TBR=johannkoenig@google.com
Bug: libyuv:805
Change-Id: Iabc4820ecf4a3a7dcb0063e67ce276ae2a4f0501
Tested: gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true"
ninja -v -C out/Release libyuv_unittest
Reviewed-on: https://chromium-review.googlesource.com/1125400
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
H420/H422 are bt.720 variants
TBR=braveyao@chromium.org
BUG=libyuv:799
TESTED=try bots tested build on all platforms
Change-Id: I007d8981d91ca0748c59403759109bbcd88f286c
Reviewed-on: https://chromium-review.googlesource.com/1115719
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Consistently use one style of line endings for the repository
Change-Id: Idd70e3d7f3a7a6641b268a81e51eebf9c705b67d
Reviewed-on: https://chromium-review.googlesource.com/1107877
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Avoid warnings regarding loss of qualifiers:
warning: cast from type ‘const uint8_t* {aka const unsigned char*}’
to type ‘v16i8* {aka __vector(16) signed char*}’ casts away
qualifiers
BUG=libyuv:793
Change-Id: Ie0d215bc07b49285b5d06ee91ccc2c9a7979799e
Reviewed-on: https://chromium-review.googlesource.com/1107879
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This reverts commit a8aa921c4614f9d6a0e8f3459648ca1ae75cdbe6.
Reason for revert: breaks a webrtc unittest on Windows.
https://bugs.chromium.org/p/webrtc/issues/detail?id=9263&can=2&start=0&num=100&q=&colspec=ID%20Pri%20M%20ReleaseBlock%20Component%20Status%20Owner%20Summary&groupby=&sort=
Original change's description:
> Allow negative height when ConvertToI420/ARGB is called with NV12/NV21
>
> ConvertToI420 and ConvertToARGB support the use of a negative height
> parameter to flip the image vertically. When converting from NV12 or
> NV21 this parameter was misinterpreted, resulting in invalid output.
> This CL introduces the use of abs_src_height to correctly calculate
> the location of the source UV plane.
>
> The sign of crop_height is not used, to reduce confusion ConvertToI420
> and ConvertToARGB no longer accept negative crop height.
>
> Unit tests for Android420ToI420 are updated to fix miscalculation of
> src_stride_uv, fix incorrect pixel strides, and to test inversion.
> New unit tests are included to test inversion for ConvertToARGB,
> ConvertToI420, Android420ToARGB, and Android420ToABGR.
> For consistency the test NV12Crop is renamed ConvertToI420_NV12_Crop.
>
> Bug: libyuv:446
> Test: out/Release/libyuv_unittest --gtest_filter=*.ConvertTo*:*.Android420To*
> Change-Id: Idc98e62671cb30272cfa7e24fafbc8b73712f7c6
> Reviewed-on: https://chromium-review.googlesource.com/994074
> Commit-Queue: Frank Barchard <fbarchard@chromium.org>
> Reviewed-by: Frank Barchard <fbarchard@chromium.org>
TBR=fbarchard@chromium.org,robert@bares.me
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: libyuv:446, chromium:9263, libyuv:801
Change-Id: I7c55b3fcb477f9754c249b9c2c54b24da2c29283
Reviewed-on: https://chromium-review.googlesource.com/1081267
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Mask was set to 32, but should have been 31.
BUG=libyuv:798
TESTED=try bots tested
Change-Id: I6120928873a4a2f1efef907d8e8296ca8c20bb03
Reviewed-on: https://chromium-review.googlesource.com/1054830
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
ConvertToI420 and ConvertToARGB support the use of a negative height
parameter to flip the image vertically. When converting from NV12 or
NV21 this parameter was misinterpreted, resulting in invalid output.
This CL introduces the use of abs_src_height to correctly calculate
the location of the source UV plane.
The sign of crop_height is not used, to reduce confusion ConvertToI420
and ConvertToARGB no longer accept negative crop height.
Unit tests for Android420ToI420 are updated to fix miscalculation of
src_stride_uv, fix incorrect pixel strides, and to test inversion.
New unit tests are included to test inversion for ConvertToARGB,
ConvertToI420, Android420ToARGB, and Android420ToABGR.
For consistency the test NV12Crop is renamed ConvertToI420_NV12_Crop.
Bug: libyuv:446
Test: out/Release/libyuv_unittest --gtest_filter=*.ConvertTo*:*.Android420To*
Change-Id: Idc98e62671cb30272cfa7e24fafbc8b73712f7c6
Reviewed-on: https://chromium-review.googlesource.com/994074
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
When casting a const value, ensure the cast is const as well.
BUG=webm:1509
Change-Id: I5b597fdcc148d111e9824bc7cf918fc5f24e970f
Reviewed-on: https://chromium-review.googlesource.com/996553
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
iOS simulator has the option to build with xcode instead of clang.
GN use_xcode_clang=true enables the xcode build.
As of version Xcode 9.2, the clang version used does not support
AVX512. The version reported is version 9, but for normal clang,
version 7 is sufficient to AVX512.
When a version of XCode does support AVX512, the version check can
be updated to allow AVX512 for newer versions of XCode.
with XCode 9.2 the following macro is set.
__APPLE_CC__ 6000
Bug: libyuv:789
Test: gn gen out/Release "--args=is_debug=false target_os=\"ios\" ios_enable_code_signing=false target_cpu=\"x86\" use_xcode_clang=true"
Change-Id: I5a9a0b4a2760c7d09a4bcb464b3668979113b07e
Reviewed-on: https://chromium-review.googlesource.com/991595
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Bug: libyuv:789
Test: builds locally on linux with clang
Change-Id: I3000494d4b0b18f59d7852bc1bc0c9e422d2d63a
Reviewed-on: https://chromium-review.googlesource.com/987331
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Scalar multiply expects a 'd' register. The "w" (float) uses 's' for float
and wont work with the multiply in 32 bit (it does in 64 bit).
A vector 2 of float passes as 'd' register.
A vector 4 of float passes as 'q' register.
This change copies the float into the first entry of a vector 2
and passes that. The optimizer removes the extra copy, allowing
the single float to use referenced as
Test: LibYUVPlanarTest.TestByteToFloat
Bug: libyuv:786
Change-Id: I8773c5bae043c7b84e1d1db7fdea6731aa0b1323
Reviewed-on: https://chromium-review.googlesource.com/973984
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
row.h adds CLANG_HAS_AVX512
function ifdefs in row.h for avx512
source code ifdefed function by function for
avx512 and avx2.
Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: If32b51459685d0d5785c5c1e94c8f668f8e74b55
Reviewed-on: https://chromium-review.googlesource.com/982402
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Adds a method that forces the CPU flags. Useful when using libyuv inside
a sandboxed process which may not have access to the file system.
Bug: libyuv:787
Change-Id: I01f71e39a7301085d9de388eba930b4cac0fd7be
Reviewed-on: https://chromium-review.googlesource.com/972338
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Move getenv to unittest.cc to allow libyuv to be
run in sandbox for x86, x64 and aarch64
Bug: libyuv:767
Test: unittests still run and respect environment variables
Change-Id: I84cb1717977828776142b51c029774b3e6b142a3
Reviewed-on: https://chromium-review.googlesource.com/969645
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Use VMBI instructions but on AVX2 registers to avoid clockrate change.
Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: Id4f8ad1e0e142a380c8a46c5eab90ce145a10edd
Reviewed-on: https://chromium-review.googlesource.com/956609
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Change unittest flags to decimal so they can be used for --libyuv_cpu_info=
Add environment variables to disable AVX 512 bits.
Bug: libyuv:784
Test: LibYUVBaseTest.TestCpuHas
Change-Id: Iea6704368fbe9f6d3395933da7993fb2a3453225
Reviewed-on: https://chromium-review.googlesource.com/957704
Reviewed-by: richard winterton <rrwinterton@gmail.com>
AVX2 port of SSSE3 conversion to output 24 bit RGB
Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: I14f7815522d1b790ecd2bb39d9a3441e803b694a
Reviewed-on: https://chromium-review.googlesource.com/953303
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Use 2 step conversion for NV21ToRGB24 to leverage AVX2
low levels instead of C.
Was C
NV21ToRGB24_Opt (882 ms)
Now SSSE3
NV21ToRGB24_Opt (218 ms)
Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: I58faf766bbec4cc595aab2e217f6c874dd4b4363
Reviewed-on: https://chromium-review.googlesource.com/951629
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Each byte is converted to float (0.0 to 255.0) and then multiplied
by a scale parameter.
Bug: None
Test: arm 64 build passes.
Change-Id: I04736798540b8d985f60abdf0388e24a209d075b
Reviewed-on: https://chromium-review.googlesource.com/930226
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Ian Field <ianfield@google.com>
ARGB rotation using scaling code. Previously it had forward
declarations of the low level row functions used. This CL
uses the header and hooks up Any and MSA versions of the code.
Bug: libyuv:779
Test: perf record out/Release/libyuv_unittest --gtest_filter=*ARGBRotate90_Opt --libyuv_width=640 --libyuv_height=359 --libyuv_repeat=999
Change-Id: Ifacd58b26bb17a236181a404fad589fd2543b911
Reviewed-on: https://chromium-review.googlesource.com/927530
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
AR30ToARGB will vectorize if the output is masked
together as an int instead of 4 byte stores.
Performance is 2x faster
Was AR30ToARGB_Opt (1585 ms)
Now AR30ToARGB_Opt (746 ms)
Bug: libyuv:777
Test:LibYUVConvertTest.AR30ToARGB_Opt
Change-Id: Idd47ae599d5d125207bb53e618d6d7e784d4a37c
Reviewed-on: https://chromium-review.googlesource.com/923169
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
BGR variation of 10 bit conversion using swapped U and V
and mirrored matrix to produce AB30 format instead of AR30.
Bug: libyuv:777
Test: LibYUVConvertTest.H010ToAB30_Opt
Change-Id: I96d115a5d1e12138f40cb548871e03aa3ab210eb
Reviewed-on: https://chromium-review.googlesource.com/922284
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
This reverts commit 7b9ff4a0355c778f2cf03bdb15029d60a1259061.
Reason for revert: ios build bots are red
Original change's description:
> tidy applied with readability-*
>
> TBR=braveyao@chromium.org
> Bug: libyuv:750
> Test: builds and runs and passes more tidy tests
> Change-Id: I316822f7d13b370b88b92a693912e880b21f92c8
> Reviewed-on: https://chromium-review.googlesource.com/907371
> Reviewed-by: Frank Barchard <fbarchard@chromium.org>
TBR=fbarchard@chromium.org,braveyao@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: libyuv:750
Change-Id: I4a73ffee2b71664c6cb93f38f2b5d70ebd76953e
Reviewed-on: https://chromium-review.googlesource.com/912175
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: I023699a7aa61ea3f5e4a21647112691ea5739281
Reviewed-on: https://chromium-review.googlesource.com/902170
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
I422ToUYVYRow_AVX2 optimized from 7 cycles per 32 pixels to 4.6 cycles.
Instead of 2 vpermq and vpunpcklbw:
vmovdqu (%1),%%xmm2
vmovdqu 0x00(%1,%2,1),%%xmm3
vpermq $0xd8,%%ymm2,%%ymm2
vpermq $0xd8,%%ymm3,%%ymm3
vpunpcklbw %%ymm3,%%ymm2,%%ymm2
..use vpmovzxbd to expand the bytes to shorts, then vpslld and vpor
vpmovzxbd (%1),%%ymm2
vpmovzxbd 0x00(%1,%2,1),%%ymm3
vpslld $0x10,%%ymm3,%%ymm3
vpor %%ymm3,%%ymm2,%%ymm2
which reduces the port 5 bottleneck by 1 cycle.
Bug: libyuv:556
Test: out/Release/libyuv_unittest --gtest_filter=*I42?To*UY*Opt
Change-Id: I53799e53cc6b090a1a695c839094c193be3eecaf
Reviewed-on: https://chromium-review.googlesource.com/899873
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
vpshufb is used to reverse R and B channels;
Code is otherwise the same as ARGBToAR30.
Bug: libyuv:751
Test: ABGRToAR30 unittest
Change-Id: I30e02925f5c729e4496c5963ba4ba4af16633b3b
Reviewed-on: https://chromium-review.googlesource.com/891807
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
ABGR is the more common format on Android.
This CL converts 10 bit AR30, to standard 8 bit ABGR.
Unoptimized but allows better testing and feature completeness.
Bug: libyuv:751
Test: LibYUVConvertTest.AR30ToABGR_Opt
Change-Id: I0c7e7273158be215129e0a1d355587ae15942299
Reviewed-on: https://chromium-review.googlesource.com/891694
Reviewed-by: Miguel Casas <mcasas@chromium.org>