AVX-VNNI detect
- Add kCpuHasAVXVNNI flag
- Remove deprecated GFNI detect to make space.
https://bugs.chromium.org/p/libyuv/issues/detail?id=967
Meteor Lake has AVX-VNNI but not AVX512
~/intelsde/sde -mtl -- blaze-bin/third_party/libyuv/libyuv_test --gunit_filter=*CpuHas
doyuv3
Note: Google Test filter = *CpuHas
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from LibYUVBaseTest
[ RUN ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x203ff1
Has X86 0x10
Has SSE2 0x20
Has SSSE3 0x40
Has SSE41 0x80
Has SSE42 0x100
Has AVX 0x200
Has AVX2 0x400
Has ERMS 0x800
Has FMA3 0x1000
Has F16C 0x2000
Has AVX512BW 0x0
Has AVX512VL 0x0
Has AVX512VNNI 0x0
Has AVX512VBMI 0x0
Has AVX512VBMI2 0x0
Has AVX512VBITALG 0x0
Has AVX512VPOPCNTDQ 0x0
HAS AVXVNNI 0x200000
Has AVXVNNIINT8 0x0
Running on all CPUs, the following report AVX-VNNI:
grep 'AVXVNNI 0x2' */*
adl/libyuv64.txt:HAS AVXVNNI 0x200000
gnr/libyuv64.txt:HAS AVXVNNI 0x200000
grr/libyuv64.txt:HAS AVXVNNI 0x200000
mtl/libyuv64.txt:HAS AVXVNNI 0x200000
rpl/libyuv64.txt:HAS AVXVNNI 0x200000
spr/libyuv64.txt:HAS AVXVNNI 0x200000
srf/libyuv64.txt:HAS AVXVNNI 0x200000
while these support AVX512-VNNI:
grep 'VNNI 0x1' */*
clx/libyuv64.txt:Has AVX512VNNI 0x10000
cpx/libyuv64.txt:Has AVX512VNNI 0x10000
gnr/libyuv64.txt:Has AVX512VNNI 0x10000
icl/libyuv64.txt:Has AVX512VNNI 0x10000
icx/libyuv64.txt:Has AVX512VNNI 0x10000
spr/libyuv64.txt:Has AVX512VNNI 0x10000
tgl/libyuv64.txt:Has AVX512VNNI 0x10000
and these support AVX-VNNI-INT8:
grep AVXVNNIINT8.0x4 */*
grr/libyuv64.txt:Has AVXVNNIINT8 0x400000
srf/libyuv64.txt:Has AVXVNNIINT8 0x400000
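As a rough illustration of how the "Cpu Flags" word in the test output relates to the individual "Has ..." lines, the sketch below decodes a mask using only the bit values printed above. The table and function names are hypothetical, and the low bit 0x1 is not named in the output, so it is reported as a leftover.

```python
# Bit values copied from the TestCpuHas output and grep results above.
# CPU_FLAGS and decode_cpu_flags are illustrative names, not libyuv API.
CPU_FLAGS = {
    "X86": 0x10,
    "SSE2": 0x20,
    "SSSE3": 0x40,
    "SSE41": 0x80,
    "SSE42": 0x100,
    "AVX": 0x200,
    "AVX2": 0x400,
    "ERMS": 0x800,
    "FMA3": 0x1000,
    "F16C": 0x2000,
    "AVX512VNNI": 0x10000,
    "AVXVNNI": 0x200000,
    "AVXVNNIINT8": 0x400000,
}


def decode_cpu_flags(mask):
    """Return (names of set flags, bits not covered by CPU_FLAGS)."""
    names = [name for name, bit in CPU_FLAGS.items() if mask & bit]
    known = 0
    for bit in CPU_FLAGS.values():
        known |= bit
    return names, mask & ~known


# Meteor Lake (-mtl) mask from the log: AVX-VNNI is set, no AVX512 bits.
names, leftover = decode_cpu_flags(0x203FF1)
print(names, hex(leftover))
```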
Bug: libyuv:967
Change-Id: I84cd71d1b320e7c284173eb695fc1d3b72d14ddb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4912017
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
- Add kCpuHasAVXVNNIINT8 flag
- Move mips flags up a bit to make space.
~/intelsde/sde -srf -- blaze-bin/third_party/libyuv/libyuv_test --gunit_filter=*CpuHas
Note: Google Test filter = *CpuHas
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from LibYUVBaseTest
[ RUN ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x403ff1
Has X86 0x10
Has SSE2 0x20
Has SSSE3 0x40
Has SSE41 0x80
Has SSE42 0x100
Has AVX 0x200
Has AVX2 0x400
Has ERMS 0x800
Has FMA3 0x1000
Has F16C 0x2000
Has AVX512BW 0x0
Has AVX512VL 0x0
Has AVX512VNNI 0x0
Has AVX512VBMI 0x0
Has AVX512VBMI2 0x0
Has AVX512VBITALG 0x0
Has AVX512VPOPCNTDQ 0x0
Has AVXVNNIINT8 0x400000
Has GFNI 0x0
[ OK ] LibYUVBaseTest.TestCpuHas (32 ms)
AVX-VNNI-INT8 is supported on srf and grr:
-srf Set chip-check and CPUID for Intel(R) Sierra Forest CPU
-grr Set chip-check and CPUID for Intel(R) Grand Ridge CPU
Bug: b/303434603
Change-Id: I628007929ff0518b2b36e1469b4d9aed71a9fa8f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4912015
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
The src_width parameter is used only in assertions, so it is unused when
NDEBUG is defined. Fix the resulting warning, which is treated as an error
when that part of the code is built with -Wall -Wextra -Werror.
BUG=libyuv:967
Change-Id: I4c02ab013e8e2684b3bed5ce9693e1493d7751b9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4905033
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Add scalar code for AR64ToAB64, ARGBToRGBA, ARGBToBGRA, ARGBToABGR, RGBAToARGB, BGRAToARGB, and ABGRToARGB.
They were originally implemented via ARGBShuffle.
This CL implements them independently and enables them only for RISC-V for now.
This CL also adds RVV implementations for `RGBA-family <-> RGBA-family` color conversions.
* Run on SiFive internal FPGA (VLEN=128):
Test Case         Speedup
AR64ToAB64_Opt    x4.6
ARGBToRGBA_Opt    x6
ARGBToBGRA_Opt    x6
ARGBToABGR_Opt    x6
RGBAToARGB_Opt    x6
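To make the difference between the shuffle-based and the dedicated scalar conversions concrete, here is a minimal sketch. It assumes that ARGB is stored in memory as bytes B, G, R, A (and ABGR as R, G, B, A); the index tables and function names are hypothetical and chosen only for illustration.

```python
# Source byte index for each destination byte of a 4-byte pixel.
# Byte orders are an assumption: ARGB is taken to be B, G, R, A in memory.
SHUFFLES = {
    "ARGBToBGRA": (3, 2, 1, 0),  # reverse the four bytes
    "ARGBToABGR": (2, 1, 0, 3),  # swap B and R, keep G and A
    "ARGBToRGBA": (3, 0, 1, 2),  # move A to the front
}


def shuffle_row(src, indices, width):
    """Generic path: reorder every pixel through an index table."""
    dst = bytearray(4 * width)
    for x in range(width):
        for i, j in enumerate(indices):
            dst[4 * x + i] = src[4 * x + j]
    return bytes(dst)


def argb_to_abgr_row(src, width):
    """Dedicated scalar path: the channel order is written out explicitly."""
    dst = bytearray(4 * width)
    for x in range(width):
        b, g, r, a = src[4 * x : 4 * x + 4]
        dst[4 * x : 4 * x + 4] = bytes((r, g, b, a))
    return bytes(dst)


row = bytes([1, 2, 3, 4, 5, 6, 7, 8])  # two pixels
print(shuffle_row(row, SHUFFLES["ARGBToABGR"], 2) == argb_to_abgr_row(row, 2))
```

Both paths give the same bytes; the dedicated version simply removes the table lookup, which is what lets a platform supply its own optimized row function for each conversion.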
Change-Id: Ie0630901046084aa259699fcdeccc64170d7103f
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4797451
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
- MSAN fails on most inline assembly, unaware of what the load and store instructions do.
- MSAN also fails on row_any functions, which memcpy the correct number of pixels into a buffer that is SIMD vector sized, apply SIMD to the full vector, and then memcpy the exact number of resulting pixels to the output buffer. MSAN wants the temporary buffer to be initialized, which is generally done with a memset(buf, 0, sizeof(buf)); to satisfy MSAN.
- RVV may not require disabling MSAN, since row functions are all 'any' number of elements, and implementation is intrinsics.
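The row_any pattern described above can be modeled roughly as follows. This is only a sketch: the 16-byte vector width, the invert operation, and all names are hypothetical stand-ins for a real SIMD row function.

```python
VECTOR_BYTES = 16  # hypothetical SIMD vector width


def invert_row_simd(src, dst, width):
    """Stand-in for a SIMD kernel: only accepts whole vectors."""
    assert width % VECTOR_BYTES == 0
    for i in range(width):
        dst[i] = 255 - src[i]


def invert_row_any(src, dst, width):
    """Handle any width: SIMD on full vectors, padded buffer for the rest."""
    full = width - width % VECTOR_BYTES
    if full:
        invert_row_simd(src, dst, full)
    rem = width - full
    if rem:
        # bytearray() is zero-filled, which plays the role of
        # memset(buf, 0, sizeof(buf)) in the C code.
        tmp_in = bytearray(VECTOR_BYTES)
        tmp_out = bytearray(VECTOR_BYTES)
        tmp_in[:rem] = src[full:width]   # copy in only the valid pixels
        invert_row_simd(tmp_in, tmp_out, VECTOR_BYTES)
        dst[full:width] = tmp_out[:rem]  # copy out only the valid results


src = bytes(range(20))
dst = bytearray(20)
invert_row_any(src, dst, 20)
print(list(dst[-4:]))
```

The padding bytes never reach the output, but the kernel still reads them, which is why a sanitizer that tracks uninitialized memory needs the temporary buffer to be cleared first.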
Bug: b/297979878
Change-Id: Ic21200689c0c7d2c85bb1de3eef38570137d3d8b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4832740
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Save the value of a common subexpression in a local variable.
Change-Id: I5724fcf341900cb2a65eb37b505194b8d3c3da9a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4735651
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
The ScaleUVRowUp2_(Bi)linear_RVV functions are equivalent to other platforms' ScaleRowUp2_(Bi)linear_Any_XXX functions.
They process the entire row.
Other platforms implement only the non-edge part of the image in SIMD and process the edges with scalar code.
ScaleRowUp2_(Bi)linear_Any_XXX combines ScaleRowUp2_(Bi)linear_XXX (non-edge) and ScaleRowUp2_(Bi)linear_C (edge) via SBUH2LANY/SU2BLANY.
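A minimal sketch of the edge/non-edge split for a 2x linear upscale of one row follows. The 3:1 filter weights, the edge replication, and the function names are assumptions made for illustration; they are not taken from this CL.

```python
def up2_linear_interior(src, dst):
    """Non-edge part: each neighbouring source pair yields two outputs."""
    for x in range(len(src) - 1):
        a, b = src[x], src[x + 1]
        dst[2 * x + 1] = (3 * a + b + 2) >> 2  # closer to a
        dst[2 * x + 2] = (a + 3 * b + 2) >> 2  # closer to b


def up2_linear_row(src):
    """Whole row in one function: edges plus interior."""
    n = len(src)
    dst = bytearray(2 * n)
    dst[0] = src[0]              # left edge has no left neighbour
    dst[2 * n - 1] = src[n - 1]  # right edge has no right neighbour
    up2_linear_interior(src, dst)
    return bytes(dst)


print(list(up2_linear_row(bytes([0, 100]))))
```

In the split design the interior loop is what gets vectorized and the two edge assignments stay scalar; a single function that covers the whole row, as described for RVV, needs no separate wrapper.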
* Run on SiFive internal FPGA:
Test case                      RVV function                Speedup
I444ScaleFrom640x360_Bilinear  ScaleRowUp2_Bilinear_RVV    8.21
I444ScaleFrom640x360_Linear    ScaleRowUp2_Linear_RVV      8.08
UVScaleFrom640x360_Bilinear    ScaleUVRowUp2_Bilinear_RVV  7.80
UVScaleFrom640x360_Linear      ScaleUVRowUp2_Linear_RVV    7.03
Change-Id: I539245ce51858f077506a78f0e7e82377ac6a95d
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4666062
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
* Run on SiFive internal FPGA:
Test case       Speedup
ARGBBlend_Opt   4.60
BlendPlane_Opt  5.96
I420Blend_Opt   5.83
- Also, add code to use ScaleRowDown2Box_RVV in I420Blend
Change-Id: Icc75e05d26b3427a98269d2a33c4474074033264
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4681100
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This CL adds support for custom compiler flags, because upstream clang
can now build for the x280 with -mcpu=sifive-x280.
Change-Id: Ic8fbf026fe6805ac5c3422a9ccc3f53293c89570
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4713191
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This CL adds the Shipped field (and may update the
License File field) in Chromium READMEs. Changes were
automatically created, so if you disagree with any of
them (e.g. a package is used only for testing purposes
and is not shipped), comment the suggested change and
why.
See the LSC doc at go/lsc-chrome-metadata.
Bug: b:285450740
Change-Id: I69bd0f58ab3b3861498f355e5a5650dcddfa3a6f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4666442
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Anne Redulla <aredulla@google.com>
- Add static to internal scale and rotate functions
- Remove unittest that tested an internal scale function
- Remove unused private functions
- Include missing scale_argb.h header
- Bump version and apply clang format
Bug: libyuv:830
Change-Id: I45bab0423b86334f9707f935aedd0c6efc442dd4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4658956
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Chromium has now merged the loongarch config file (bug:1454442),
so we are resubmitting GN build support for loongarch.
Bug: chromium:1289502
Change-Id: Iac83f5ea016945f7d9cc5f6de20d4c561bab6347
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4615589
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
The src/build project has merged the loongarch config file, but DEPS has not been updated yet.
This causes the CQ to fail when testing the loongarch GN build support patch.
Bug: chromium:1289502
Change-Id: I2c5ae204e2fa3a9776b82a624b3cce08bf25216b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4614917
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Root cause:
InterpolateRow_RVV does not set the rounding mode to round-to-nearest-up when y1_fraction == 128.
ARGBAttenuateRow_RVV sets the rounding mode register to round-down.
As a result, InterpolateRow_RVV (y1_fraction == 128) can run in round-down mode,
which produces output that differs from round-to-nearest-up mode.
Solution: ensure InterpolateRow_RVV uses the correct rounding mode.
Also remove the unnecessary rounding mode setup in ARGBAttenuateRow_RVV.
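The effect of the rounding mode can be seen in a small model of the half-and-half blend. This sketch assumes that y1_fraction == 128 means an equal (128/256) mix of the two rows, computed as an averaging add; the function names are hypothetical.

```python
def averaging_add(a, b, round_to_nearest_up):
    """Model of an averaging add of two unsigned bytes."""
    total = a + b
    if round_to_nearest_up:
        total += 1  # add half of the discarded bit before shifting
    return total >> 1  # round-down simply drops the low bit


def interpolate_row_half(row0, row1, round_to_nearest_up=True):
    """Blend two rows 50/50 under the given rounding mode."""
    return bytes(
        averaging_add(a, b, round_to_nearest_up) for a, b in zip(row0, row1)
    )


row0 = bytes([1, 2])
row1 = bytes([2, 2])
print(list(interpolate_row_half(row0, row1, True)))
print(list(interpolate_row_half(row0, row1, False)))
```

The two modes differ only when the sum of the two pixels is odd, so a rounding mode left over from an earlier function shows up as off-by-one differences in some pixels rather than an obvious failure.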
Bug: libyuv:956
Change-Id: Ib5265d42bad76b036e42b8f91ee42a9afe1f768d
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4624492
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Bug: libyuv:956
Change-Id: Ib539c2196767e88fa6e419ed2f22d95b6deaf406
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4623172
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
1. Fix compile warning in row_rvv.cc
2. Avoid compiling row_rvv.cc/scale_rvv.cc when using GCC
RVV segment load & store are not available in GCC.
Hence, avoid compiling RVV code with GCC for now.
3. Add several compile options to cmake build flow
-Wno-sign-compare
-Wno-unused-function
-Wunused-variable
-Wuninitialized
Bug: libyuv:956
Change-Id: I9577f98190fc9b28fb6fde65d82d0c67ce54f9ee
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4615441
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
- Makes ARM and Intel match and fixes some off by 1 cases
- Add ARGBToUV444MatrixRow_NEON
- Add ConvertFP16ToFP32Column_NEON
- scale_rvv: fix intrinsic build error
- disable row_win version of ARGBAttenuate/Unattenuate
Bug: libyuv:936, libyuv:956
Change-Id: Ied99aaad3a11a8eb69212b628c58f86ec0723c38
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4617013
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
* Run on SiFive internal FPGA:
TestARGBExtractAlpha(~3.2x vs scalar)
TestARGBCopyYToAlpha(~1.6x vs scalar)
Change-Id: I36525c67e8ac3f71ea9d1a58c7dc15a4009d9da1
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4617955
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Run on SiFive internal FPGA:
Test case                     RVV function                Speedup
I444ScaleDownBy3by4_None      ScaleRowDown34_RVV          5.8
I444ScaleDownBy3by4_Linear    ScaleRowDown34_0/1_Box_RVV  6.5
I444ScaleDownBy3by4_Bilinear  ScaleRowDown34_0/1_Box_RVV  6.3
Bug: libyuv:956
Change-Id: I8ef221ab14d631e14f1ba1aaa25d2b30d4e710db
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4607777
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Now that chromium/base has rolled and switched the android_ndk_root to
the new android_toolchain directory, remove the stale Android NDK. Fix
up documentation that refers to stale paths and suggest the appropriate
tools to perform objdump operations.
Bug: 1448383
Test: Verified build of LibYUV.
Change-Id: I7b674052b1ef0914cf4ee81c6c6d62410e5fc569
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4583622
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Prashanth Swaminathan <prashanthsw@google.com>
Chromium is being updated to 'android_toolchain', which means the
'android_ndk' DEPS is no longer present. Remove it from the roller until
the transition is complete, then it can be removed from this script
entirely.
Bug: 1448383
Test: Verified manual roll of libyuv.
Change-Id: I4a96e54edba9a077cb5d5214af53de5906bce8f1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4599468
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Prashanth Swaminathan <prashanthsw@google.com>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
The Android NDK dependency is moving to a CIPD bucket to reduce the
checkout cost and to eventually move to NDK v25. This introduces the
NDK into an 'android_toolchain' directory. Following the roll of
chromium/base in this repository, a second change will delete the old
'android_ndk' checkout. As a result, the checkout size of this
repository will temporarily increase.
Bug: 1448383
Test: Verified local builds of LibYUV.
Change-Id: I35a933e2d7853b12e155c5d2b727cd4b1c5474e5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4583617
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Uses I012ToAR30Matrix with u and v swapped and with VU suffixed
constants.
Bug: b/268505204
Change-Id: If0d189891be3053da776feb48d49fa68a9866037
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581869
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
They re-use the same method as I410/I210 to I420 with a depth
value of 12 instead of 10.
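As a sketch of what a depth parameter changes when reducing high-bit-depth samples to 8 bits, the code below uses a plain right shift by (depth - 8) with a clamp. The exact scaling used by the library is not shown in this message, so the arithmetic and the function names here are assumptions.

```python
def to_8bit(value, depth):
    """Reduce one sample of the given bit depth to 8 bits (assumed: shift + clamp)."""
    return min(value >> (depth - 8), 255)


def plane_to_8bit(plane, depth):
    """Convert a whole plane of samples with one shared depth value."""
    return bytes(to_8bit(v, depth) for v in plane)


# The same routine covers 10-bit and 12-bit input; only depth differs.
print(list(plane_to_8bit([0, 512, 1023], 10)))
print(list(plane_to_8bit([0, 2048, 4095], 12)))
```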
Bug: b/268505204
Change-Id: I299862b4556461d8c95f0fc1dcd5260e1c1f25cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581867
Commit-Queue: Vignesh Venkatasubramanian <vigneshv@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
This was added in the android internal master here: ag/19780505.
This keeps the upstream checkout in sync with the android
snapshot.
Bug: b/268505204
Change-Id: Ie821ebb6914c208b0cfa7127faf56ad2bcece6ac
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581052
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Vignesh Venkatasubramanian <vigneshv@google.com>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Updates the requests version in .vpython3 to the latest available in
order to pick up a security fix. Also changes the requested version to
the Python 3-only one since Python 2 support was removed from requests.
Bug: chromium:1448265
Change-Id: I6eb4081735aee77f38793a00e9f17bdd32a52c58
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581054
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>