1740 Commits

Author SHA1 Message Date
Frank Barchard
696e619571 RVV check __riscv_v_intrinsic version
Bug: libyuv:965
Change-Id: I9b02abd13ab3345288655fa7a16383f59cf66bb8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4750230
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2023-08-04 18:39:27 +00:00
Wan-Teh Chang
a8a37a25c9 Eliminate a common subexpression in YPixel()
Save the value of a common subexpression in a local variable.

Change-Id: I5724fcf341900cb2a65eb37b505194b8d3c3da9a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4735651
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2023-07-31 20:53:54 +00:00
Bruce Lai
c60ac4025c [RVV] Enable ScaleRowDown38_RVV & ScaleRowDown38_{2,3}_Box_RVV
* Run on SiFive internal FPGA:

Test Case			Speedup
I420ScaleDownBy3by8_None	4.2
I420ScaleDownBy3by8_Linear	1.7
I420ScaleDownBy3by8_Bilinear	1.7
I420ScaleDownBy3by8_Box		1.7
I444ScaleDownBy3by8_None	4.2
I444ScaleDownBy3by8_Linear	1.8
I444ScaleDownBy3by8_Bilinear	1.8
I444ScaleDownBy3by8_Box		1.8

Change-Id: Ic2e98de2494d9e7b25f5db115a7f21c618eaefed
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4711857
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-07-27 02:59:47 +00:00
Darren Hsieh
10de943a12 [RVV] Enable ScaleRowUp2_(Bi)linear_RVV/ScaleUVRowUp2_(Bi)linear_RVV
The ScaleUVRowUp2_(Bi)linear_RVV functions are equivalent to other platforms' ScaleRowUp2_(Bi)linear_Any_XXX.
They process the entire row.
Other platforms implement only the non-edge part of the image and process the edges with scalar code.
ScaleRowUp2_(Bi)linear_Any_XXX combines ScaleRowUp2_(Bi)linear_XXX (non-edge) with ScaleRowUp2_(Bi)linear_C (edge) via SBUH2LANY/SU2BLANY.
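The edge versus non-edge split can be sketched in scalar C. This is only an illustration of a whole-row 2x linear upsample, not the RVV code: the function name is hypothetical, and the 3:1 weighting with a +2 rounding term is an assumption about the filter.

```c
#include <stdint.h>

/* Upsample one row of `width` samples to 2 * width samples (width >= 1).
 * Hypothetical helper; the 3:1 weights are an assumption. */
static void scale_row_up2_linear(const uint8_t* src, uint8_t* dst, int width) {
  dst[0] = src[0]; /* left edge: no left neighbour to blend with */
  for (int i = 0; i + 1 < width; ++i) {
    /* non-edge part: each source pair yields two blended outputs */
    dst[2 * i + 1] = (uint8_t)((3 * src[i] + src[i + 1] + 2) >> 2);
    dst[2 * i + 2] = (uint8_t)((src[i] + 3 * src[i + 1] + 2) >> 2);
  }
  dst[2 * width - 1] = src[width - 1]; /* right edge */
}
```

A whole-row function like this handles both edges itself, so it needs no separate "Any" wrapper around it.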

* Run on SiFive internal FPGA:

Test case                      RVV function                Speedup
I444ScaleFrom640x360_Bilinear  ScaleRowUp2_Bilinear_RVV    8.21
I444ScaleFrom640x360_Linear    ScaleRowUp2_Linear_RVV      8.08
UVScaleFrom640x360_Bilinear    ScaleUVRowUp2_Bilinear_RVV  7.80
UVScaleFrom640x360_Linear      ScaleUVRowUp2_Linear_RVV    7.03

Change-Id: I539245ce51858f077506a78f0e7e82377ac6a95d
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4666062
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-07-26 18:05:50 +00:00
Bruce Lai
d33edd2373 [RVV] Enable ARGBBlendRow_RVV/BlendPlaneRow_RVV
* Run on SiFive internal FPGA:
Test case       Speedup
ARGBBlend_Opt	4.60
BlendPlane_Opt	5.96
I420Blend_Opt	5.83

- Also, add code to use ScaleRowDown2Box_RVV in I420Blend

Change-Id: Icc75e05d26b3427a98269d2a33c4474074033264
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4681100
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-07-25 16:38:55 +00:00
Darren Hsieh
aed6dbef17 [RVV] Enable NV{12,21}To{ARGB,RGB24}Row_RVV
* Run on SiFive internal FPGA(w/ -march=rv64gcv):

Test Case	Speedup
NV12ToARGB_Opt	12.0
NV21ToARGB_Opt	12.1
NV12ToABGR_Opt	12.6
NV21ToABGR_Opt	12.0
NV12ToRGB24_Opt	12.5
NV21ToRGB24_Opt	11.7
NV12ToRAW_Opt	12.1
NV21ToRAW_Opt	11.4

Change-Id: Icae2bac2b4ebbd4c5a89e847fde9a74fe6481878
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4707804
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-07-24 17:07:01 +00:00
Frank Barchard
650be7496f Fix warnings for missing prototypes
- Add static to internal scale and rotate functions
- Remove unittest that tested an internal scale function
- Remove unused private functions
- Include missing scale_argb.h header
- Bump version and apply clang format

Bug: libyuv:830
Change-Id: I45bab0423b86334f9707f935aedd0c6efc442dd4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4658956
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2023-06-30 17:46:56 +00:00
Frank Barchard
a34a0ba687 ARGBExtractAlpha rename variables to match format
Bug: libyuv:956
Change-Id: I31070791754fc69b72c6dcc61be2e038d2676ed9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4646636
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-06-27 03:50:35 +00:00
Bruce Lai
873d0db989 [RVV] Fix TestARGBInterpolate test fail
Root cause:
InterpolateRow_RVV does not set the rounding mode to round-to-nearest-up when y1_fraction == 128.
ARGBAttenuateRow_RVV sets the rounding mode register to round-down.
As a result, InterpolateRow_RVV (y1_fraction == 128) runs in round-down mode,
and its output differs from the output produced in round-to-nearest-up mode.

Fix: ensure InterpolateRow_RVV uses the correct rounding mode.

Also, remove the unnecessary rounding mode setup in ARGBAttenuateRow_RVV.
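Why the rounding mode matters for the y1_fraction == 128 case (an even blend of two rows) can be shown with a scalar sketch. The helper names are hypothetical and this is not the RVV implementation; it only contrasts the two rounding behaviours for an average.

```c
#include <stdint.h>

/* Average of two samples, rounding halves up (round-to-nearest-up). */
static uint8_t avg_round_nearest_up(uint8_t a, uint8_t b) {
  return (uint8_t)((a + b + 1) >> 1);
}

/* Average of two samples, truncating the half (round-down). */
static uint8_t avg_round_down(uint8_t a, uint8_t b) {
  return (uint8_t)((a + b) >> 1);
}
```

The two helpers agree whenever a + b is even and differ by one whenever it is odd, which is enough to break a bit-exact comparison against the C reference.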

Bug: libyuv:956
Change-Id: Ib5265d42bad76b036e42b8f91ee42a9afe1f768d
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4624492
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-19 16:49:52 +00:00
Bruce Lai
4472b5b849 [RVV] Update ARGBAttenuateRow_RVV implementation
Bug: libyuv:956
Change-Id: Ib539c2196767e88fa6e419ed2f22d95b6deaf406
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4623172
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-17 15:50:34 +00:00
Bruce Lai
7939e039e7 [RVV] Fix compile warning in row_rvv
1. Fix compile warning in row_rvv.cc

2. Avoid compiling row_rvv.cc/scale_rvv.cc when using GCC
RVV segment load & store is not available on GCC.
Hence, temporarily avoid compiling RVV code with GCC.

3. Add several compile options to cmake build flow
  -Wno-sign-compare
  -Wno-unused-function
  -Wunused-variable
  -Wuninitialized

Bug: libyuv:956
Change-Id: I9577f98190fc9b28fb6fde65d82d0c67ce54f9ee
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4615441
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-06-17 15:41:45 +00:00
Frank Barchard
a366ad714a ARGBAttenuate use (a + b + 255) >> 8
- Makes ARM and Intel match and fixes some off by 1 cases
- Add ARGBToUV444MatrixRow_NEON
- Add ConvertFP16ToFP32Column_NEON
- scale_rvv fix intrinsic build error
- disable row_win version of ARGBAttenuate/Unattenuate
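A minimal scalar sketch of the attenuate formula follows. It assumes the expression in the title stands for the product of a color channel and alpha plus 255, shifted right by 8; that reading, and the helper name, are assumptions and not taken from the commit itself.

```c
#include <stdint.h>

/* Attenuate color channel c by alpha a.
 * Assumes the formula is (c * a + 255) >> 8; hypothetical helper. */
static uint8_t attenuate(uint8_t c, uint8_t a) {
  return (uint8_t)(((uint32_t)c * a + 255u) >> 8);
}
```

Under this assumption a fully opaque pixel keeps full intensity, because (255 * 255 + 255) >> 8 is exactly 255, while a zero channel or zero alpha still yields 0.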

Bug: libyuv:936, libyuv:956
Change-Id: Ied99aaad3a11a8eb69212b628c58f86ec0723c38
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4617013
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-16 21:37:53 +00:00
Bruce Lai
04821d1e7d [RVV] Enable ARGBExtractAlphaRow/ARGBCopyYToAlphaRow
* Run on SiFive internal FPGA:

TestARGBExtractAlpha(~3.2x vs scalar)
TestARGBCopyYToAlpha(~1.6x vs scalar)

Change-Id: I36525c67e8ac3f71ea9d1a58c7dc15a4009d9da1
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4617955
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-06-15 23:45:24 +00:00
Darren Hsieh
552571e8b2 [RVV] Enable ScaleRowDown34_RVV & ScaleRowDown34_{0,1}_Box_RVV
Run on SiFive internal FPGA:

Test case                     RVV function                Speedup
I444ScaleDownBy3by4_None      ScaleRowDown34_RVV          5.8
I444ScaleDownBy3by4_Linear    ScaleRowDown34_0/1_Box_RVV  6.5
I444ScaleDownBy3by4_Bilinear  ScaleRowDown34_0/1_Box_RVV  6.3

Bug: libyuv:956
Change-Id: I8ef221ab14d631e14f1ba1aaa25d2b30d4e710db
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4607777
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-14 00:57:00 +00:00
Frank Barchard
2a5d7e2fbc FilterRows_NEON - remove unused function - same as InterpolateRow_NEON
- Bump version to 1872
- Add scale_rvv to build files

Bug: libyuv:956
Change-Id: Ib9e9fd840a0774bd35bcdcca55a2596f33272383
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4608519
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-13 15:20:02 +00:00
Darren Hsieh
873eaa3bbf [RVV] Enable Scale{ARGB,UV}RowDown{2,4,EVEN}_RVV
Run on SiFive internal FPGA:

Test case                RVV function                          Speedup
I444ScaleDownBy3_Box     ScaleAddRow_RVV+ScaleAddCols(scalar)  2.8
ARGBScaleDownBy2_None    ScaleARGBRowDown2_RVV                 2.2
ARGBScaleDownBy2_Linear  ScaleARGBRowDown2Linear_RVV           5.0
ARGBScaleDownBy2_Box     ScaleARGBRowDown2Box_RVV              4.3
ARGBScaleDownBy4_None    ScaleARGBRowDownEven_RVV              1.2
ARGBScaleDownBy8_Box     ScaleARGBRowDownEvenBox_RVV           3.2
ARGBScaleDownBy4_Box     ScaleARGBRowDown2Box_RVV              4.5
I444ScaleDownBy2_None    ScaleRowDown2_RVV                     5.8
I444ScaleDownBy2_Linear  ScaleRowDown2Linear_RVV               6.1
I444ScaleDownBy2_Box     ScaleRowDown2Box_RVV                  5.0
I444ScaleDownBy4_None    ScaleRowDown4_RVV                     3.6
I444ScaleDownBy4_Box     ScaleRowDown4Box_RVV                  3.5
UVScaleDownBy2_None      ScaleUVRowDown2_RVV                   5.8
UVScaleDownBy2_Linear    ScaleUVRowDown2Linear_RVV             5.6
UVScaleDownBy2_Box       ScaleUVRowDown2Box_RVV                4.1
UVScaleDownBy4_None      ScaleUVRowDown4_RVV                   1.7
UVScaleDownBy4_Box       ScaleUVRowDown2Box_RVV                4.5
                         avg-speedup:                          4

Note: Specialize ScaleUVRowDown with step_size=4 by ScaleUVRowDown4_RVV.

Bug: libyuv:956
Change-Id: If9604a6aadf681193f282507602c57c726332202
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4601684
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-13 00:40:39 +00:00
Frank Barchard
b08ccb6a83 FP16 to FP32 float conversion row function
Bug: None
Change-Id: I97aab6aafd41c3bf36bfbf33fdcc424e5b3fd6e3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4590225
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-06-07 00:02:40 +00:00
Frank Barchard
157b153b60 Fix tidy warning that uint32_t dither4 should not be const
- Remove const from uint32_t dither4 parameter to fix clang-tidy warning
- Apply clang format
- Bump version
- Remove unused MMI source; superseded by MSA

Bug: None
Change-Id: Id49991db25bca4e99590b415312542d917471c62
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581882
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-06-02 00:42:02 +00:00
Vignesh Venkatasubramanian
c0f64c14ca Add I412/I212 to I420 functions
They re-use the same method as I410/I210 to I420 with a depth
value of 12 instead of 10.
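How a single depth parameter can serve both 10-bit and 12-bit input is sketched below. The helper name and the fixed-point scale of 1 << (24 - depth) are assumptions made for illustration, not a quote of the library code.

```c
#include <stdint.h>

/* Reduce one sample of the given bit depth (9..16) to 8 bits.
 * Hypothetical helper; the scale formula is an assumption. */
static uint8_t sample_to_8bit(uint16_t v, int depth) {
  uint32_t scale = 1u << (24 - depth);        /* 16384 for 10-bit, 4096 for 12-bit */
  uint32_t out = ((uint32_t)v * scale) >> 16; /* same as v >> (depth - 8) */
  return (uint8_t)(out > 255u ? 255u : out);  /* clamp out-of-range input */
}
```

With this shape, moving from I410/I210 to I412/I212 only changes the depth argument from 10 to 12.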

Bug: b/268505204
Change-Id: I299862b4556461d8c95f0fc1dcd5260e1c1f25cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581867
Commit-Queue: Vignesh Venkatasubramanian <vigneshv@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-06-01 19:50:16 +00:00
Bruce Lai
4b6373d189 [RVV] Use LMUL=2 for I4{44,22}To{ARGB,RGB24,RGBA} conversion
Replace vv+m1(LMUL=1) with vx+m2(LMUL=2).
Some kernels' asm code might contain 1~2 register spills.

Change-Id: Ie3655f250d17f37c1ba9039474ece43ede98ede0
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4573159
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-30 09:42:10 +00:00
Darren Hsieh
d14bd701c8 [RVV] Enable CopyRow_RVV, InterpolateRow_RVV, {Merge,Split}UVRow_RVV
* Run on SiFive internal FPGA:

MergeUVPlane_Opt(~6x vs scalar)
SplitUVPlane_Opt(~6x vs scalar)
TestCopyPlane(~8x vs scalar)
ARGBInterpolate0_Opt(~10x vs scalar)
ARGBInterpolate64_Opt(~9x vs scalar)
ARGBInterpolate168_Opt(~9x vs scalar)
ARGBInterpolate192_Opt(~8.5x vs scalar)
ARGBInterpolate255_Opt(~8x vs scalar)

Bug: libyuv:956
Change-Id: I8372341865f75f42e30371ef943d5c2e4be7b79a
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4574186
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-05-30 09:10:35 +00:00
Frank Barchard
78d168054b Remove extraneous quote from clobber list
Bug: None
Change-Id: Ie20574d0f9c8c2f074247405b294b49c3406448d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4568770
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2023-05-30 09:03:05 +00:00
Justin Green
0e111d2c58 Wrap neon registers in {} for the neon MT2T unpack implementation. Some compilers throw a syntax error otherwise.
Change-Id: Ic169dcfe4d9bb9bf6d0dcae977d6cf510a7a60bf
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4568904
Commit-Queue: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-26 17:12:02 +00:00
Frank Barchard
22c7a51452 Fix SplitRGB clobber list to include all registers used
Bug: None
Change-Id: Icac4becb0537903ab87495fb0e2a2b750e1eca4f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4563355
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: David Gao <davidgao@google.com>
2023-05-24 21:44:59 +00:00
Wan-Teh Chang
dcbe082070 Save boxwidth - minboxwidth in a local variable
Avoid repetitions of the expression boxwidth - minboxwidth.

Change-Id: Ib53fb6b06a926b80ff9a64cc5d499aeef0894c99
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4408062
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-22 19:10:13 +00:00
Bruce Lai
de3e7fd147 Manually remove rounding value inside yb(yuvconstant) in row_rvv.cc
After libyuv:961 is completed, yb(yuvconstant) will no longer contain the rounding bias +32 for fixed-point.
This CL manually removes the rounding bias (-32) in row_rvv.cc.
Hence, the rounding mode of all fixed-point related code in row_rvv.cc is changed to round-to-nearest-up ("0").

Also, replace vwmul+vnsrl w/ vmulh in I400ToARGBRow_RVV.
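The trade described above, dropping a +32 bias and rounding in the shift instead, can be checked in scalar C. The sketch assumes a 6-bit fractional shift (32 being half of 64); the shift amount and the helper names are assumptions.

```c
#include <stdint.h>

/* Truncating shift applied to a value that already carries the +32 bias. */
static uint32_t shift_with_bias(uint32_t x) {
  return (x + 32u) >> 6;
}

/* No bias; the shift itself rounds to nearest, halves up. */
static uint32_t shift_round_nearest_up(uint32_t x) {
  return (x >> 6) + ((x >> 5) & 1u); /* add the most significant dropped bit */
}
```

Both helpers return the same result for every input that does not overflow, which is why the bias can be removed once the rounding mode is round-to-nearest-up.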

Bug: libyuv:956, libyuv:961
Change-Id: I10e34668a2332e38393e9d68414f07aafb6c7cf7
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4550591
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-22 18:15:27 +00:00
Wan-Teh Chang
179b0203e5 Enable {J400/I400}ToARGBRow_RVV
Run on SiFive internal FPGA*:

I400ToARGB_Opt (~8x vs scalar)
J400ToARGB_Opt (~10x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

Bug: libyuv:956, libyuv:961
Change-Id: If4e21ec85c4ff79083ec16a6faae0e457129a8de
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4544972
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2023-05-20 23:29:33 +00:00
Lu Wang
8670bcf17f Optimize the following 19 functions with LSX in row_lsx.cc.
UYVYToYRow_LSX, UYVYToUVRow_LSX, UYVYToUV422Row_LSX,
ARGBToUVRow_LSX, ARGBToRGB24Row_LSX, ARGBToRAWRow_LSX,
ARGBToRGB565Row_LSX, ARGBToARGB1555Row_LSX, ARGBToARGB4444Row_LSX,
ARGBToUV444Row_LSX, ARGBMultiplyRow_LSX, ARGBAddRow_LSX,
ARGBSubtractRow_LSX, ARGBAttenuateRow_LSX, ARGBToRGB565DitherRow_LSX,
ARGBShuffleRow_LSX, ARGBShadeRow_LSX, ARGBGrayRow_LSX,
ARGBSepiaRow_LSX

Bug: libyuv:913
Change-Id: I02c0c9d68b229c4a66c96837e9b928c2f5dda1f3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4546814
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-05-19 18:55:58 +00:00
Frank Barchard
a37799344d ARGBToI420Alpha function to convert ARGB to I420 with Alpha
Bug: b/281866362
Change-Id: Ic1093a887fb483f134c78909cf1ee7495e7345ba
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4534100
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-05-17 00:23:24 +00:00
Bruce Lai
11d4536002 Enable I{422,444}AlphaToARGBRow_RVV & ARGBAttenuateRow_RVV
Run on SiFive internal FPGA:

I444AlphaToARGB_Opt (~16x vs scalar)
I422AlphaToARGB_Opt (~10x vs scalar)
ARGBAttenuate_Opt (~3x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

Change-Id: I0046eb7af8104bc8e13cee1cb91a19f90940d5b0
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4535657
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-16 19:20:49 +00:00
Frank Barchard
6a68b18a96 Bump version and apply clang format
Bug: libyuv:956
Change-Id: I2375a02583789af2a5f13f8dba6c663d5975aaa9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4522352
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-05-11 11:27:28 +00:00
Bruce Lai
59eae49f17 Enable ARGBToYMatrixRow_RVV/RGBAToYMatrixRow_RVV/RGBToYMatrixRow_RVV
Run on SiFive internal FPGA:

ARGBToJ400_Opt (~6x vs scalar)
RGBAToJ400_Opt (~6x vs scalar)
RGB24ToJ400_Opt (~5.5x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

Change-Id: Ia3ce8cea7962fbd8618cc23e850a7913c9cabf4f
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4521783
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-05-11 10:17:51 +00:00
Darren Hsieh
497ea35688 Enable I444To{ARGB,RGB24}Row_RVV
Run on SiFive internal FPGA:

I444ToARGB_Opt (~16x vs scalar)
I444ToRGB24_Opt (~10x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

Change-Id: Idae7dc46ef648beaa14b58ba3eb56b67b17c9b3b
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4520761
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-10 19:50:56 +00:00
Darren Hsieh
964d963afb Enable I422To{ARGB,RGBA,RGB24}Row_RVV
Run on SiFive internal FPGA:

I422ToARGB_Opt (~10x vs scalar)
I422ToRGBA_Opt (~10x vs scalar)
I420ToRGB24_Opt (~8x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

This CL manually sets the rounding mode,
since we use the fixed-point vector narrowing clip.
The spec does not define a default value for the fixed-point rounding mode:
https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#38-vector-fixed-point-rounding-mode-register-vxrm
The behavior could differ between platforms. To avoid unexpected behavior, we set the rounding mode manually.
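The four vxrm settings described in the linked spec section can be modelled in scalar C to see how they diverge. This is a model for illustration, not intrinsic code; the enum and function names are hypothetical, while the mode encodings 0 to 3 follow the spec.

```c
#include <stdint.h>

/* vxrm encodings: round-to-nearest-up, round-to-nearest-even,
 * round-down (truncate), round-to-odd. */
enum vxrm_mode { VXRM_RNU = 0, VXRM_RNE = 1, VXRM_RDN = 2, VXRM_ROD = 3 };

/* Shift v right by d bits (1 <= d <= 31), rounding according to mode. */
static uint32_t roundoff_unsigned(uint32_t v, unsigned d, enum vxrm_mode mode) {
  uint32_t shifted = v >> d;
  uint32_t lsb = shifted & 1u;                            /* v[d] */
  uint32_t half = (v >> (d - 1)) & 1u;                    /* v[d-1] */
  uint32_t below = (v & ((1u << (d - 1)) - 1u)) != 0u;    /* v[d-2:0] != 0 */
  uint32_t r = 0;
  switch (mode) {
    case VXRM_RNU: r = half; break;
    case VXRM_RNE: r = half & (below | lsb); break;
    case VXRM_RDN: r = 0; break;
    case VXRM_ROD: r = (uint32_t)(!lsb && (half || below)); break;
  }
  return shifted + r;
}
```

For example, shifting 10 right by 2 (the value 2.5) gives 3 under RNU and ROD but 2 under RNE and RDN, so a kernel that inherits whatever mode was left in the register can produce different pixels on different platforms.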

Change-Id: I90f0dcb90c37f7da7caab8eb1df6c9c7a3c874a8
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4512373
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-10 00:29:20 +00:00
Lu Wang
1d940cc570 Optimize the following functions with LSX.
MirrorRow_LSX, MirrorUVRow_LSX, ARGBMirrorRow_LSX,
I422ToYUY2Row_LSX, I422ToUYVYRow_LSX, I422ToARGBRow_LSX,
I422ToRGBARow_LSX, I422AlphaToARGBRow_LSX, I422ToRGB24Row_LSX,
I422ToRGB565Row_LSX, I422ToARGB4444Row_LSX, I422ToARGB1555Row_LSX,
YUY2ToYRow_LSX, YUY2ToUVRow_LSX, YUY2ToUV422Row_LSX

Bug: libyuv:913
Change-Id: I46cec605001d7ddd73846eed6d0a77f936b6dc53
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4515191
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-10 00:25:48 +00:00
James Zern
b372510c56 row_win.cc: fix ARM64EC build
include intrin.h rather than emmintrin.h; fixes:
C:\...\VC\Tools\MSVC\14.35.32215\include\emmintrin.h(28,1):
fatal  error C1189: #error:  this header should only be included through

Change-Id: Ief9c81f6f1971e552c8aac301d678b64fe5bd7cc
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4513825
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-05-09 19:56:35 +00:00
shaodiwei
4c209d264d Make the MergeUVRow_AVX2 implementation consistent between row_win.cc and row_gcc.cc; this fixes an out-of-bounds memory write
Change-Id: I4b771a46fc853effc4c0fa3ae8032322a8369dc9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4514810
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-05-09 18:54:25 +00:00
Bruce Lai
f4bd840794 Fix compile error for riscv scalar & simplify cmake cross build flow
1. Fix compile error when building riscv without vector

2. Fix run_qemu.sh misusing v=true for the USE_RVV=OFF case

3. [cmake] Fix warning by renaming TEST to UNIT_TEST
Warning log:
CMake Warning (dev) at CMakeLists.txt:57 (if):
  Policy CMP0064 is not set: Support new TEST if() operator.  Run "cmake
  --help-policy CMP0064" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  TEST will be interpreted as an operator when the policy is set to NEW.
  Since the policy is not set the OLD behavior will be used.
This warning is for project developers.  Use -Wno-dev to suppress it.

4. [cmake] Simplify logic for cross-build

Bug: libyuv:956

Change-Id: I120402fc7d6d86403e7d974180b81f4f9c663e36
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4486239
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-05-04 18:09:00 +00:00
Bruce Lai
8811ad8ba1 Fix TestLinuxRVV test fail
Fail log:
[ RUN      ] LibYUVBaseTest.TestLinuxRVV
Note: testing to load "../../unit_test/testdata/riscv64.txt"
/scratch/brucel/libyuv/src/unit_test/cpu_test.cc:290: Failure
Expected equality of these values:
  kCpuHasRVV | kCpuHasRVVZVFH
    Which is: 1610612736
  RiscvCpuCaps("../../unit_test/testdata/riscv64_rvv_zvfh.txt")
    Which is: 536870912
[  FAILED  ] LibYUVBaseTest.TestLinuxRVV (17 ms)

Reason:
The root cause is that the ext variable may contain "\n".
The last extension substring contains "\n".
For instance, in test case riscv64_rvv_zvfh.txt, the last substring is "zvfh\n" instead of "zvfh".
Fixed by removing the "\n" at the end of the line.

NOTE: We avoid using strstr() to solve the problem here,
because strstr() would violate the parsing rule if a future extension name contains "zvfh" (e.g. zvfhxxx).
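The parsing rule described above, strip the trailing newline and then compare whole extension names, might look like the following sketch. The function name, the 256-byte buffer, and the underscore-separated input format are assumptions made for illustration, not the code in cpu_id.

```c
#include <string.h>

/* Returns 1 if the '_'-separated list contains exactly `name`, else 0.
 * Hypothetical helper; input longer than 255 bytes is truncated. */
static int has_extension(const char* ext_list, const char* name) {
  char buf[256];
  strncpy(buf, ext_list, sizeof(buf) - 1);
  buf[sizeof(buf) - 1] = '\0';
  buf[strcspn(buf, "\r\n")] = '\0'; /* drop the trailing newline */

  size_t name_len = strlen(name);
  const char* p = buf;
  while (*p) {
    size_t len = strcspn(p, "_"); /* length of the current token */
    if (len == name_len && memcmp(p, name, len) == 0) return 1;
    p += len;
    if (*p == '_') ++p; /* skip the separator */
  }
  return 0;
}
```

Comparing token length as well as content is what keeps a hypothetical "zvfhxxx" from being mistaken for "zvfh", which a plain strstr() search would not guarantee.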

Log after modification:
[ RUN      ] LibYUVBaseTest.TestLinuxRVV
Note: testing to load "../../unit_test/testdata/riscv64.txt"
[       OK ] LibYUVBaseTest.TestLinuxRVV (38 ms)

Change-Id: I7b7db98dbc5388cbc148423da6892b8f0be64599
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4498101
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-05-04 03:26:25 +00:00
Darren Hsieh
1b3c4c12d4 Add Split/Merge RGB/ARGB/XRGB Row_RVV
* Run on SiFive internal FPGA:

SplitRGBPlane_Opt (~6.87x vs scalar)

SplitARGBPlane_Opt (~10.77x vs scalar)

SplitXRGBPlane_Opt (~18.69x vs scalar)

MergeRGBPlane_Opt (~3.63x vs scalar)

MergeARGBPlane_Opt (~3.50x vs scalar)

MergeXRGBPlane_Opt (~2.90x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

- include a fix to avoid implicit conversion warning between size_t & int.

Bug: libyuv:956

Change-Id: Icd79b282b04ea3981e7fd4e6d547da6708d82516
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4443411
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-04-28 18:34:46 +00:00
Frank Barchard
7c6a7e5737 cpuid for arm/mips/riscv initialize buffer
- change cpu printf to hex to better show flags

util/cpuid:
Cpu Flags 0x30000001
Has RISCV 0x10000000
Has RVV 0x20000000


[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x30000001
Has RISCV 0x10000000
Has RVV 0x20000000
Has RVVZVFH 0x0
[       OK ] LibYUVBaseTest.TestCpuHas (1 ms)
[ RUN      ] LibYUVBaseTest.TestCompilerMacros
__ATOMIC_RELAXED 0
__cplusplus 201703
__clang_major__ 9999
__clang_minor__ 0
__GNUC__ 4
__GNUC_MINOR__ 2
__riscv 1
__riscv_vector 1
__clang__ 1
__llvm__ 1
__pic__ 2
INT_TYPES_DEFINED
__has_feature
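Printing the flags in hex makes the bit layout readable at a glance. The sketch below reproduces the first three lines of the log above; the macro and function names are hypothetical, and the two bit values are taken from that log.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit values as printed in the log above; the names are hypothetical. */
#define FLAG_RISCV 0x10000000u
#define FLAG_RVV 0x20000000u

/* Formats the flags the way the log shows them; returns snprintf's result. */
static int format_cpu_flags(uint32_t flags, char* out, size_t n) {
  return snprintf(out, n,
                  "Cpu Flags 0x%x\nHas RISCV 0x%x\nHas RVV 0x%x\n",
                  (unsigned)flags,
                  (unsigned)(flags & FLAG_RISCV),
                  (unsigned)(flags & FLAG_RVV));
}
```

In hex, 0x30000001 visibly contains both the RISCV and RVV bits, whereas the same value in decimal (805306369) gives no such hint.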

Bug: libyuv:956
Change-Id: Iee4f1f34799434390e756de1e6c2c4596d82ace5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4484957
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-27 22:46:27 +00:00
Frank Barchard
cf21b5ea5c Rename variables to match layout of ABGR
Bug: None
Change-Id: Ia1d596b6e108307fe042a03c34162b25152293d4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4461967
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-26 16:57:33 +00:00
Bruce Lai
1330a79e9f Optimized AR64/AB64 <-> ARGB with RVV
* Run on SiFive internal FPGA:

ARGBToAR64_Opt (~13.7x vs scalar)
ARGBToAB64_Opt (~5.81x vs scalar)
AR64ToARGB_Opt (~15.8x vs scalar)
AB64ToARGB_Opt (~2.40x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

Bug: libyuv:956

Change-Id: Ida642a5077f59d25fb7c5328f671956b2293dadd
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4442913
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-20 19:49:55 +00:00
Frank Barchard
c994782086 Enable RVV if qemu is detected
- include a fix for jpeg unittests to do at least 1 iteration
- include a fix for scale uv to only use linearup2 if filter is linear

Tested on qemu with Intel host:
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 805306369
Has RISCV 268435456
Has RVV 536870912
Has RVVZVFH 0
Has X86 0

Bug: libyuv:956, libyuv:959, libyuv:960
Change-Id: I4a1b66f83d82ba127780f52526153d586db90111
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4429570
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Randall Bosetti <rlb@google.com>
2023-04-18 20:29:04 +00:00
Darren Hsieh
44396e6e9a Add ARGBToRAWRow_RVV, ARGBToRGB24Row_RVV, RGB24ToARGBRow_RVV
* Run on SiFive internal FPGA:

ARGBToRAW_Opt (~1.55x vs scalar)

ARGBToRGB24_Opt (~1.44x vs scalar)

RGB24ToARGB_Opt (~1.77x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

Bug: libyuv:956

Change-Id: I26722f6848cd68684d95d9a7ee06ce0416e7985d
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4413083
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-13 19:33:16 +00:00
Frank Barchard
68659d0d68 UVScale down by 2 fix for C and optimize for NEON
- update cpu_id to use "re" for fopen to avoid leaking handles if a thread is started while the file is open.
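The "e" in the fopen mode string is an extension (supported by glibc, among others) that opens the file with the close-on-exec flag set, so the descriptor is not inherited by a program exec'd while the file is open. The helper below is a hypothetical sketch for observing that flag, assuming a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>

/* Opens path with the given mode and reports whether the descriptor has
 * FD_CLOEXEC set: 1 if set, 0 if not, -1 on error. Hypothetical helper. */
static int opened_with_cloexec(const char* path, const char* mode) {
  FILE* f = fopen(path, mode);
  if (!f) return -1;
  int fd_flags = fcntl(fileno(f), F_GETFD);
  fclose(f);
  if (fd_flags == -1) return -1;
  return (fd_flags & FD_CLOEXEC) ? 1 : 0;
}
```

On a C library that supports the extension, calling it with "re" reports the flag as set, while plain "r" reports it as clear.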

Bug: libyuv:958
Change-Id: I1af9de68fce12e440e1226fc8070634ccb1bf090
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4417176
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-12 22:49:20 +00:00
Frank Barchard
ee3e71c7ce Any functions use memset(vin, 0, sizeof(vin)) for GCC warning fix
- Fix -Wmemset-elt-size warning for GCC
- Use vin for inputs and vout for outputs

Bug: None
Change-Id: Iefd418dc884b4d062e1fdd9215319c8838c49eaa
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4412065
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
2023-04-10 20:44:10 +00:00
Darren Hsieh
724e7aee03 Fix macro define typo in scale_uv.cc
The correct define can be found in scale_row.h

Change-Id: I633ed47006c7bd8014038493005c2d934489ff18
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4411353
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-10 16:55:48 +00:00
James Zern
0200037a5a row_any,ANYDETILE: fix -Wmemset-elt-size warning
under gcc 12.2.0 using -Wall:

source/row_any.cc: In function ‘void libyuv::DetileRow_16_Any_SSE2(const
                       uint16_t*, ptrdiff_t, uint16_t*, int)’:
source/row_any.cc:2287:11: warning: ‘memset’ used with length equal to
number of elements without multiplication by element size
[-Wmemset-elt-size]
 2287 |     memset(temp, 0, 16 * BPP); /* for msan */
      |     ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
source/row_any.cc:2308:1: note: in expansion of macro ‘ANYDETILE’
 2308 | ANYDETILE(DetileRow_16_Any_SSE2, DetileRow_16_SSE2, uint16_t, 2, 15)

This increases the memset to the full buffer size, which may not be
strictly necessary.
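What the warning guards against can be seen with a 16-element uint16_t buffer: a length of 16 clears only half of its 32 bytes. The helper is hypothetical and takes the length as a parameter so that the sketch itself does not trigger the warning.

```c
#include <stdint.h>
#include <string.h>

/* Fills a 16-element uint16_t buffer with 0xFF bytes, clears the first
 * clear_len bytes, and returns how many bytes are still nonzero. */
static size_t nonzero_bytes_after_clear(size_t clear_len) {
  uint16_t temp[16];
  unsigned char bytes[sizeof(temp)];
  size_t count = 0;
  if (clear_len > sizeof(temp)) clear_len = sizeof(temp);
  memset(temp, 0xFF, sizeof(temp));
  memset(temp, 0, clear_len);
  memcpy(bytes, temp, sizeof(temp));
  for (size_t i = 0; i < sizeof(bytes); ++i) {
    if (bytes[i] != 0) ++count;
  }
  return count;
}
```

Passing the element count (16) leaves 16 bytes untouched, while passing 16 * sizeof(uint16_t) clears the whole buffer.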

Change-Id: Iea2fc649990ee84ea9aa8020d6f6b25e012b18fb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4406599
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-04-08 19:01:02 +00:00
Darren Hsieh
e8af6cb2e4 Add RAWToARGBRow_RVV,RAWToRGBARow_RVV,RAWToRGB24Row_RVV
* Run on SiFive internal FPGA:

RAWToARGB_Opt (~2x vs scalar)

RAWToRGBA_Opt (~2x vs scalar)

RAWToRGB24_Opt (~1.5x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

Change-Id: I21a13d646589ea2aa3822cb9225f5191068c285b
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4408357
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-07 18:45:08 +00:00